Conference PaperPDF Available

When Are Smells Indicators of Architectural Refactoring Opportunities? A Study of 50 Software Projects

Authors:

Abstract and Figures

Refactoring is a widely adopted practice for improving code comprehension and for removing severe structural problems in a project. When refactorings affect the system architecture, they are called architectural refactorings. Unfortunately, developers usually do not know when and how they should apply refactorings to remove architectural problems. Nevertheless, they might be more susceptible to applying architectural refactoring if they rely on code smells and code refactoring-two concepts that they usually deal with through their routine programming activities. To investigate if smells can serve as indicators of architectural refactoring opportunities, we conducted a retrospective study over the commit history of 50 software projects. We analyzed 52,667 refactored elements to investigate if they had architectural problems that could have been indicated by automatically-detected smells. We considered purely structural refactorings to identify elements that were likely to have architectural problems. We found that the proportion of refactored elements without smells is much lower than those refactored with smells. By analyzing the latter, we concluded that smells can be used as indicators of architectural refactoring opportunities when the affected source code is deteriorated, i.e., the code hosting two or more smells. For example, when God Class or Complex Class appear together with other smells, they are indicators of architectural refactoring opportunities. In general, smells that often co-occurred with other smells (67.53%) are indicators of architectural refactoring opportunities in most cases (88.53% of refactored elements). Our study also enables us to derive a catalog with patterns of smells that indicate refactoring opportunities to remove specific types of architectural problems. These patterns can guide developers and make them more susceptible to apply architectural refactorings.
Content may be subject to copyright.
When Are Smells Indicators of Architectural Refactoring
Opportunities? A Study of 50 Soware Projects
Leonardo Sousa
ECE - Carnegie Mellon University
leo.sousa@sv.cmu.edu
Willian Oizumi
PUC-Rio, Brazil
woizumi@inf.puc-rio.br
Alessandro Garcia
PUC-Rio, Brazil
afgarcia@inf.puc-rio.br
Anderson Oliveira
PUC-Rio, Brazil
aoliveira@inf.puc-rio.br
Diego Cedrim
Amazon, Brazil
dccedrim@amazon.com
Carlos Lucena
PUC-Rio, Brazil
lucena@inf.puc-rio.br
ABSTRACT
Refactoring is a widely adopted practice for improving code compre-
hension and for removing severe structural problems in a project.
When refactorings aect the system architecture, they are called
architectural refactorings. Unfortunately, developers usually do not
know when and how they should apply refactorings to remove ar-
chitectural problems. Nevertheless, they might be more susceptible
to applying architectural refactoring if they rely on code smells and
code refactoring – two concepts that they usually deal with through
their routine programming activities. To investigate if smells can
serve as indicators of architectural refactoring opportunities, we
conducted a retrospective study over the commit history of 50
software projects. We analyzed 52,667 refactored elements to in-
vestigate if they had architectural problems that could have been
indicated by automatically-detected smells. We considered purely
structural refactorings to identify elements that were likely to have
architectural problems. We found that the proportion of refactored
elements without smells is much lower than those refactored with
smells. By analyzing the latter, we concluded that smells can be
used as indicators of architectural refactoring opportunities when
the aected source code is deteriorated, i.e., the code hosting two
or more smells. For example, when God Class or Complex Class ap-
pear together with other smells, they are indicators of architectural
refactoring opportunities. In general, smells that often co-occurred
with other smells (67.53%) are indicators of architectural refactoring
opportunities in most cases (88.53% of refactored elements). Our
study also enables us to derive a catalog with patterns of smells
that indicate refactoring opportunities to remove specic types of
architectural problems. These patterns can guide developers and
make them more susceptible to apply architectural refactorings.
CCS CONCEPTS
Software and its engineering Software design engineer-
ing.
Permission to make digital or hard copies of all or part of this work for personal or
classroom use is granted without fee provided that copies are not made or distributed
for prot or commercial advantage and that copies bear this notice and the full citation
on the rst page. Copyrights for components of this work owned by others than ACM
must be honored. Abstracting with credit is permitted. To copy otherwise, or republish,
to post on servers or to redistribute to lists, requires prior specic permission and/or a
fee. Request permissions from permissions@acm.org.
ICPC ’20, October 5–6, 2020, Seoul, Republic of Korea
©2020 Association for Computing Machinery.
ACM ISBN 978-1-4503-7958-8/20/05.. .$15.00
https://doi.org/10.1145/3387904.3389276
ACM Reference Format:
Leonardo Sousa, Willian Oizumi, Alessandro Garcia, Anderson Oliveira,
Diego Cedrim, and Carlos Lucena. 2020. When Are Smells Indicators of
Architectural Refactoring Opportunities? A Study of 50 Software Projects. In
28th International Conference on Program Comprehension (ICPC ’20), October
5–6, 2020, Seoul, Republic of Korea. ACM, New York, NY, USA, 12 pages.
https://doi.org/10.1145/3387904.3389276
1 INTRODUCTION
Software systems invariably undergo changes that can compromise
their structural quality. If these changes are carried out recklessly,
software systems may reach a degraded state that requires either sig-
nicant maintenance eort or the complete redesign [
12
,
21
,
43
,
54
].
Since these changes are driven by architectural decisions, the in-
troduction of architectural problems is a reason why systems can
degrade [
21
,
54
]. An architectural problem is the result of one or
more inappropriate architectural decisions that negatively impact
non-functional requirements [
6
,
45
,
50
]. Examples include Fat In-
terface, Scattered Concern and Component Overload [10, 11, 26].
Empirical studies showed that architectural problems are the
most important source of technical debt [
7
,
8
,
44
]. A study with
745 software projects showed that technical debt associated to ar-
chitectural problems was directly related to a signicant increase
in software project costs [
7
]. Given the harmfulness of architec-
tural problems, developers should remove them as early as possible.
Refactoring, which consists of a program transformation used for
improving the system structural quality, is the most common prac-
tice to remove architectural problems [9].
As architectural problems are the result of inappropriate architec-
tural decisions, refactorings should aect the system architecture to
remove them [
17
,
47
]. Refactorings that also aect the system archi-
tecture are known as architectural refactorings [
19
,
39
,
41
,
47
,
61
]. An
architectural refactoring comprises one or more code-level refactor-
ings aimed to remove architectural problems [
60
,
61
]. For example,
a Fat Interface occurs when a developer aggregates non-cohesive
functionalities into a single interface [
26
]. This problem can be re-
moved through an architectural refactoring that combines Extract
Interface,Extract Superclass, and Move Method refactorings [9, 26].
Despite the benets of architectural refactoring, developers have
not fully adopted it [
8
,
15
,
61
]. Some artifacts (e.g., architectural doc-
umentation) that could guide developers to identify architectural
refactoring opportunities are unavailable or outdated [
14
,
50
,
61
],
which contributes to its low adoption. Furthermore, developers
perceive architectural refactorings as costly and risky [
8
,
15
] since
ICPC ’20, October 5–6, 2020, Seoul, Republic of Korea Sousa et al.
they are not condent in recognizing architectural refactoring op-
portunities, specially without automated support. Consequently,
they do not know when and how they should apply architectural
refactorings to remove architectural problems. In this scenario, we
hypothesize that developers might be more susceptible in apply-
ing architectural refactoring if they rely on two familiar concepts
that they have on their disposal: code smells and code refactoring
[45, 46].
A code smell is a structure in the system implementation that
represents a surface indication of architectural problems [
9
]. De-
velopers have been using smells both as indicators of architectural
problems (e.g., [
24
,
29
,
31
,
36
,
42
,
49
]) as well as hints for code refac-
toring [
9
]. Therefore, developers could also use them to identify
architectural refactoring opportunities. For example, classes that
implement a Fat Interface usually have smells such as Divergent
Change, and Feature Envy. These two smells are also hints for devel-
opers to apply Extract Interface,Extract Superclass, or Move Method
refactorings [
9
], which are exactly the architectural refactorings
used to remove the Fat Interface [
9
,
26
]. If developers are aware of
the smells that indicate architectural problems and the code refac-
torings to remove the smells, then they may feel more susceptible
to apply architectural refactoring. In this scenario, they will know
which smells indicate an architectural problem and which code
refactorings should be applied to remove the smells. Consequently,
they may remove the architectural problem by removing the smells.
Unfortunately, smells do not always indicate architectural prob-
lems [
22
]. Knowing when smells indicate architectural refactor-
ing opportunities is essential to encourage developers in applying
architectural refactoring. This knowledge can improve the state-
of-the-art tools that support this activity. For example, a tool can
help developers to remove architectural problems by showing in
their code (i) the smells that indicate the problem, and, (ii) a ranked
list of alternative sets of code refactorings to remove the smells,
and, consequently, (iii) removing the architectural problem. Given
this context, we investigate when smells can provide to developers
opportunities to apply architectural refactorings that might remove
architectural problems.
For this investigation, we identied architectural refactorings
related to the removal of architectural problems. For this purpose,
we located the code refactorings that occurred in the commit his-
tory (147,736 commits) of 50 software projects. Dierently from
other studies [
2
,
4
,
5
], we focused on analyzing purely structural
code refactorings (also called root-canal refactoring). These code
refactorings are those primarily targeted at improving the archi-
tecture structure [
9
,
30
], thereby contributing to the partial or full
removal of one or more architectural problems. After locating the
refactored elements, we conducted a manual analysis to identify
when developers applied code refactorings to remove architectural
problems, i.e., when they applied architectural refactoring. Then,
we analyzed if the elements had smells that could indicate architec-
tural problems. We used this analysis to nd when smells indicate
architectural refactoring opportunities.
We analyzed 52,667 refactored elements. We found that in most
cases (59.48% of the refactored elements) smells are likely to indi-
cate architectural refactoring opportunities. To conrm such result,
we analyzed 1,168 root-canal refactorings. We were able to nd
examples of when smells indicate architectural refactoring oppor-
tunities that have not been reported in the literature. We were also
able to complement and better explain the few examples from the
literature. For instance, most related studies agree that God Class
is an indicator of architectural refactoring for removing an archi-
tectural problem [
1
,
36
,
58
]. Our results showed that not only God
Class is an indicator of architectural refactoring but also Complex
Class. Moreover their likelihood of being indicators increases when
they occur with other smells. We also found cases where smells
cannot be used as indicators of architectural refactoring opportu-
nities. Finally, we found patterns of smells that can often indicate
architectural refactoring for a specic type of architectural problem.
Most of these patterns have not been presented elsewhere.
2 BACKGROUND AND TERMINOLOGY
We discuss here basic concepts and terminology used in this study.
2.1 Smells Indicating Architectural Problems
Code smells are indicators of the occurrence of architectural prob-
lems in the source code [
9
]. For instance, the Unused Abstraction
problem happens when the code element representing the abstrac-
tion is not directly used or is unreachable [
3
,
20
]. Usually, this
architectural problem manifests in the source code due to modi-
cations that make code elements obsolete. This problem can be
identied by the Speculative Generality – a code smell that indicates
an element, usually a class, that was created to support anticipated
future features that never have been implemented [9].
Other examples of smell types include Long Method and God
Class [
9
,
18
]. Studies based on smells use them as indicators of
architectural problems since each smell may be fully or partially
associated with an architectural problem [
29
,
31
,
48
,
55
]. Let us
consider the Figure 1 to illustrate how smells can be signs of archi-
tectural problems. This gure uses an UML-like notation to show a
partial view of the Health Watcher system [
13
]: a web-based system
for improving the quality of services that health vigilance institu-
tions provide. The IFacade is aected by the Fat Interface problem,
represented in the gure by the puzzle symbol. This interface de-
clares methods to access three non-cohesive services, represented
in the gure by shades of blue.
Facade
HealthWatcherFacade
+ insertEmployee(): void
+ updateEmployee(): void
+ insertComplaint(): void
+ updateComplaint(): void
+ insertHealthUnit(): void
+ updateHealthUnit(): void
<<Interface>>
IFacade
+ insertEmployee(): void
+ updateEmployee(): void
+ insertComplaint(): void
+ updateComplaint(): void
+ insertHealthUnit(): void
+ updateHealthUnit(): void
Employee
InsertEmployee UpdateEmployee
Complaint
InsertComplaint UpdateComplaint
HealthUnit
InsertHealthUnit UpdateHealthUnit
Figure 1: Example of Architectural Problem
Classes in Employee, Complaint and HealthUnit packages are
clients of the IFacade interface and contain smells such as Divergent
Change,Feature Envy and Dispersed Coupling.Divergent Change
appears in the classes since they change whenever a change is made
in one of the three services. Dispersed Coupling appears because
the classes are all connected through the interface. Furthermore,
When Are Smells Indicators of Architectural Refactoring Opportunities? A Study of 50 Soware Projects ICPC ’20, October 5–6, 2020, Seoul, Republic of Korea
some of the methods in these classes may contain Feature Envy
since they are more interested in other classes than the one related
to its own service. Classes that implement IFacade also contain
smells. As the interface declares more than one responsibility (i.e.,
one service), it forces other classes to implement more than one as
well. Consequently, classes such as HealthWatcherFacade contain
God Class and Feature Envy smells. God Class emerges because the
class implements more than one responsibility.
In this scenario, the combination of these smells indicates the
architectural problem. Thus, a reasonable assumption is to expect
that code refactorings to remove the smells will also remove the
architectural problem. Therefore, developers could use the smells as
hints to apply architectural refactoring. Unfortunately, as discussed
before, some smells do not indicate architectural problems. Thus,
developers cannot use them to apply architectural refactoring; after
all, there is no architectural problem for developers to remove.
2.2 Refactorings for Architectural Problems
Refactoring is a popular technique to remove architectural problems
from a system. Refactoring is dened as a program transformation
intended at preserving the observable behavior and improving
the program structure [
9
]. Although this denition entails an ex-
pectation that refactoring always preserve the system behavior,
depending on the tactic applied, this is not always the case [
30
].
We describe next the tactics and other basic concepts.
Refactoring Type
indicates the form of the transformation
applied to attributes, methods, classes and interfaces. We considered
the 13 refactoring types [
9
]: Extract Interface,Extract Method,Extract
Superclass,Inline Method,Move Class,Move Field,Move Method,Pull
Up Field,Pull Up Method,Push Down Field,Push Down Method,
Rename Class,Rename Method. These types comprise the most
popular refactoring types [
30
] and they are directly associated with
the removal of architectural problems. For this study, we considered
the architectural problems in Table 1. The rst column presents the
type of architectural problem, and the second one contains a brief
denition about it. In the context of this study, we focused on a
set of architectural problems that may be indicated by code smells
and that may be removed through refactorings. In the example of
Figure 1, refactorings such as Extract Interface,Extract Class, and
Move Method could be applied to remove the Fat Interface.
Refactored Elements
comprise all elements that refactorings
directly aect. Each refactoring type aects directly dierent ele-
ments [
9
]. For example, the Move Method moves a method
m
from
class
A
to
B
. Thus, the refactored elements are: m,Aand B. Al-
though other elements can be indirectly aected by the refactoring,
we consider only these three elements as refactored ones.
Refactoring Tactics
indicate the two main tactics that develop-
ers follow during refactoring [
30
]: root-canal refactoring and oss
refactoring. Root-canal refactoring is applied to repair deteriorated
code, and it involves a process of exclusively applying structural
transformations. As the goal of this tactic is to repair deteriorated
code, there is a high chance of this tactic to be applied to remove ar-
chitectural problems. For instance, to get rid of Misplaced Concern,
a developer can apply a Move Method and Move Field to move the
misplaced functionality to the class to which it belongs. Conversely,
Table 1: List of Architectural Problems
Architectural Problem Denition
Ambiguous
Interface
Architectural problem that happens when a component interface
is ambiguous and provides non-cohesive services [10]
Cyclic
Dependency
Architectural problem that happens when one or more
elements depend on each other, creating a cycle [37]
Component
Overload
Architectural problem that happens when a component is
overloaded with responsibilities [23]
Concern
Overload
Architectural problem that happens when a code element
fullls too many responsibilities [23]
Fat
Interface
Architectural problem that happens when an interface exposes
many funcionalities and many of those functionalities are
not related to each other [26]
Incomplete
Abstraction
Architectural problem that happens when an element does
not support a responsibility completely in their
enclosing component [35]
Misplaced
Concern
Architectural problem that happens when an element
implements a functionality, which is not the
predominant one of their enclosing component [23]
Scattered
Concern
Architectural problem that happens when elements are
responsible for the same functionality, but some of
them cross-cut the system [10]
Unused
Abstraction
Architectural problem that happens when the code element representing
the abstraction that
is not directly used or is unreachable [3]
Unwanted
Dependency
Architectural problem that happens when a dependency violates
a rule dened on the system architecture [38]
oss refactoring is applied to achieve another objective that is dif-
ferent from structural improvements, such as adding features or
xing bugs. For example, a developer may need to apply the Push
Down Field before xing a bug related to hierarchy.
Architectural Refactoring
occurs when developers apply refac-
torings that aect the system architecture. Examples include the
introduction and removal of hierarchies, elimination of dependen-
cies, retrotting a design pattern, moving a large portion of the
source code into smaller modules, and the like [
19
,
61
]. For these
scenarios, the developers have in mind a desired high-level archi-
tecture, in which they have to apply a series of low-level code
refactorings to achieve the desired architectural impact [19].
3 STUDY DESIGN
Researchers have proposed some solutions to help developers in
applying architectural refactoring to remove architectural problems
[
17
,
19
,
41
,
61
]. However, some of them fell short in helping devel-
opers identify and apply architectural refactoring. A reason for this
is that developers usually do not have access to artifacts that could
guide them in identifying architectural refactoring opportunities,
such as the architectural documentation. In fact, identifying archi-
tectural refactoring opportunities in a system is far from trivial [
45
]
as it requires a broad understanding of the system structure and its
functionalities. To help developers with such an identication task,
we investigated if code smells can be used to indicate architectural
refactoring opportunities to remove architectural problems.
3.1 Research Question
As the architectural documentation is usually unavailable, develop-
ers have to directly analyze the source code to identify and remove
architectural problems through architectural refactoring [
45
]. In
this scenario, we hypothesize that developers can increase their
chances of adopting architectural refactoring if they rely on activi-
ties that they are familiar with. The rst activity is the detection
of code smells. According to the literature, not only developers are
familiar with code smells [
58
], but they also use them to identify
ICPC ’20, October 5–6, 2020, Seoul, Republic of Korea Sousa et al.
architectural problems [
45
,
46
]. Refactoring is the second activity
that developers are familiar with. Usually, they rely on the presence
of smells to identify low-level code refactorings [
9
]. Thus, if they
already use smells to identify architectural problems and to apply
code refactoring, they should be able to benet from using smells
to identify and apply architectural refactoring. As not all smells are
related to an architectural problem, we need to investigate when
smells are indicators of architectural refactoring opportunities. This
investigation is the focus of our research question:
RQ.
When are smells indicators of architectural refactoring
opportunities for developers?
To answer our RQ, we investigate the occurrence of architec-
tural problems that underwent repairing actions by developers. We
searched for elements that had been modied by root-canal refac-
torings, and we veried if they contained smells that could indicate
architectural problems. We can focus our analysis on the root-canal
refactorings since those are those primarily targeted at improving
the system internal structure [
9
,
30
]. Thus, elements refactored
during this tactic have a high chance to contain architectural prob-
lems. Furthermore, we will be able to better understand the relation
among code refactoring, smells, and architectural problems – triple
relationship that has not been explored simultaneously in the lit-
erature.. As result, we expect to nd recurrent cases of when code
smells are indicator of architectural refactoring opportunities.
3.2 Investigation of Refactored Elements
We rely on actual code refactorings to nd the refactored elements
that had architectural problems. However, we cannot assume that all
refactored elements have architectural problems. Hence, we focused
on analyzing root-canal refactorings. Root-canal refactoring is a
tactic primarily applied to repair deteriorated code [
9
,
30
], which
can be caused by the presence of architectural problems. Thus,
elements refactored under this tactic have a high chance to contain
architectural problems. After nding these refactored elements, we
categorized them according to the number of smells:
Smell-free category
: This category encompasses refactored
elements that are NOT aected by smells.
Single smell category
: This category comprises refactored
elements aected by only ONE smell.
Multiple smells category
: This category includes refac-
tored elements aected by more than one smell.
We highlight that we are interested in architectural problems that
may be so harmful to the point of forcing developers to apply archi-
tectural refactoring. This is the reason why we analyzed elements
that developers focused their eort during root-canal refactoring.
They are the ones that may contain architectural problems that can
lead to the system redesign [12, 21, 43, 54].
4 DATA COLLECTION AND ANALYSIS
This section presents the phases followed to answer our RQ. The
replication package [
34
] contains the algorithm used (i) to identify
the refactored elements, (ii) to collect code smells, and (iii) to cate-
gorize the refactored elements according to the number of smells.
It also oers the entire dataset with the refactorings and smells
collected for the 50 GitHub projects.
4.1 Phase 1: Selection of Software Projects
The rst phase consisted of choosing open source projects. We
focused on open source projects to allow the study replication. As
GitHub is the world’s largest open source community, we estab-
lished it to be the source of software projects, in which we selected
projects that matched the following criteria:
Projects that have been evaluated with dierent levels of
popularity. As GitHub star is a metric to keep track of how
popular a project is among GitHub users, we used it to select
projects with dierent popularity levels;
Projects with an active issue tracking system.
Projects with at least 90% of code written in Java language.
These criteria allowed us to select 50 software projects that
are active and has been used in diverse contexts by the software
community. We focused on Java projects because Java is a very
popular programming language, and it was also targeted by related
studies [
2
,
4
]. Furthermore, we also selected projects in Java due to
the availability of tools to identify refactorings [
51
] and code smells
[
32
]. Thus, we selected Java projects with a diversity of structure,
size and popularity. The replication package [
34
] contains their
detailed information.
4.2 Phase 2: Code Refactoring Detection
The second phase consisted of detecting code refactorings for all
selected projects. We chose Refactoring Miner [
51
,
52
] (version
0.2.0) as the tool to detect code refactorings. This tool implements
a lightweight version of UMLDi algorithm [
57
] for dierencing
object-oriented models. When the tool is applied between two
versions, it returns the elements that changed from one version
to the other. It also returns the refactoring type associated to the
change. The reported precision of 98% [
40
,
52
] led to a very low
rate of false positives, as conrmed in our validation phase. The
tool detects the 13 refactoring types used in our study (Section 2.2).
After the code refactoring detection, we divided the refactor-
ings according to the applied tactic. We computed the number of
refactorings belonged to the oss or root-canal refactoring based
on the output of the Refactoring Miner and the eGit plugin (http:
//www.eclipse.org/egit/). We ran the eGit plugin to provide us the
changes within les between commits. Then, we used the UNIX di
tool to analyze all the changes in the classes modied by the refac-
torings. A code refactoring is considered root-canal if (and only
if) no other non-refactoring operation occurs in the same change
(i.e., same commit). For instance, we do not consider as root-canal
refactoring a Move Method detected in a commit together with one
or more other le modications that were not detected as code
refactorings. We performed this detection by crossing the list of
changes of each le in each commit with the changes identied as
code refactorings by the Refactoring Miner tool. From the total of
51,227 refactorings, 76.56% were classied as oss refactoring and
23.44% were classied as root-canal. This distribution is similar to
the one reported in a previous study [30].
4.3 Phase 3: Code Smell Detection
For this study, we decided to detect code smells with metrics-based
strategies [
18
,
25
]. Thus, we can compare our results with related
When Are Smells Indicators of Architectural Refactoring Opportunities? A Study of 50 Soware Projects ICPC ’20, October 5–6, 2020, Seoul, Republic of Korea
studies that used the same detection strategies [
2
,
4
]. These strate-
gies are based on a set of detection rules [
2
,
18
] that compare metric
values with predened thresholds according to logical operators.
Since we did not nd any publicly available tool that implements
these detection rules, we developed the Organic tool [
32
]. Our
tool detects the following smells: Brain Class,Brain Method,Class
Data Should Be Private,Complex Class,Data Class,Dispersed Cou-
pling,Feature Envy,God Class,Intensive Coupling,Lazy Class,Long
Method,Long Parameter List,Message Chain,Refused Bequest,Shot-
gun Surgery,Spaghetti Code,Speculative Generality. Metrics, rules
and thresholds for the 17 code smells implemented in our tool
are the ones used in the literature [
2
,
18
] and are all available in
our replication package [
34
]. These smells are closely related to
architectural problems [9, 22, 24, 28, 29, 31, 55].
4.4 Phase 4: Manual Validation of Refactorings
The last phase comprises the validation of the code refactorings.
Even though Refactoring Miner achieved a precision of 98% [
52
],
we were not sure if it would achieve the same precision in our set of
software systems. For the validation, we conducted two inspections.
The rst inspection was to validate each one of the 13 refactoring
types (Section 2.2). For this inspection, we randomly sampled refac-
torings of each type since the precision of the Refactoring Miner
could vary from one type to another. Such variation is due to the
rules implemented in the tool to detect each refactoring type. To
deliver an acceptable condence level to the results, we calculated
the sample size of each refactoring type based on a condence level
of 95% and a condence interval of 5 points. For this inspection, we
recruited ten students to validate the samples manually. The manual
inspection started by presenting to the students a pair of versions of
elements marked as refactored by Refactoring Miner. For each pair
of elements, the student had to mark it as a valid refactoring or not.
We highlight that our goal was to ensure the trustability of the tool
for our set of software systems. This is the reason why we relied
on the students’ inspection, who were familiar with refactoring.
In general, we observed a high precision for each refactoring type,
with a median of 88.36%. The precision found in all refactoring
types is within one standard deviation (7.73). Applying the Grubb
outlier test (alpha=0.05), we could not nd any outlier, indicating
that no refactoring type is strongly inuencing the median preci-
sion found. Thus, the results found in the representative sample
represents a key factor to provide trustability to our results.
The second manual inspection was to validate the classication
of the refactorings into tactics. We conducted three steps. First,
we used Eclipse and the eGit plugin to classify a refactoring as
root-canal refactoring or oss refactoring. Second, we used the
UNIX di tool to analyze all the changes in the classes modied
by the refactoring operations. Third, we analyzed the tool output,
searching for a behavioral change. When nding one, we lled a
form explaining it, and we classied the change as oss refactoring.
When we did not nd a behavioral change, we classied it as root
canal refactoring. This second validation was conducted by three
researchers from our group given their expertise in refactoring.
Each code refactoring was manually classied by two researchers.
In a case of a disagreement, the third researcher stepped in. As
result, we classied 4,991 refactorings into root-canal and oss
refactoring. We found that developers apply root-canal refactoring
in 31.5% of the cases. The condence level for this number is 95%
with a condence interval of 5%. For this inspection, we did not
use students; instead, we (paper’s authors) relied on our expertise
to conduct the inspection. We decided to conduct this inspection
among us for a couple of reasons. First, we needed people who had
experience with refactoring before. To determine that there was a
behavior change due to a refactoring, one needs to have knowledge
that students may not have. Second, this second inspection requires
more eort and time than to validate the refactoring types.
5 RESULTS
We discuss here the results to our RQ. First, we analyzed whether
the refactored elements are associated with the absence or presence
of smells (Section 5.1). Second, we investigate when smells can indi-
cate architectural refactoring opportunities in general (Section 5.2).
Then, we further discuss dierent examples of when smells are
(or are not) indicators of architectural refactoring opportunities
(Section 5.3). Finally, we used these examples to nd recurring types
of smells that may indicate architectural refactoring opportunities
for specic types of architectural problems (Section 5.4).
5.1 Categorization of Refactored Elements
Table 2 provides a summary of the relationships between the refator-
ings, and the absence or presence of smells. The table presents the
total number of refactorings (1
st column
), while the next columns
show how often the refactored elements contain none smells, (2
nd col
umn
), only a single smell (3
r d column
) or multiple smells (4
th column
).
Table 2: Categorization of Refactored Elements
Refactorings Smell-free
Category
Single Smell
Category
Multiple Smells
Category
51,227 10,512 (20.52%) 16,443 (32.10%) 24,272 (47.38%)
Most refactored elements are smelly.
If we have found that
most refactorings belong to the smell-free category, we could con-
clude that smells are often no indicators of architectural refactoring.
However, we found exactly the opposite result. Only 20.52% of the
code refactorings (10,512) were applied to elements without smells.
From the 51,227 code refactorings, 40,715 (79.48%) were applied
to elements with at least one smell. From this result, 47.38% were
applied to 24,953 elements with more than one smell, and 32.10%
were applied to 16,906 elements with one smell.
One could argue that most code refactorings were applied to
smelly elements because most elements contain smells. In other
words, the refactored elements contain smells because their soft-
ware system has a high rate of smells, thereby increasing the likeli-
hood that code refactorings are applied to smelly elements. To verify
if most elements contain smells, we computed the probability of ran-
domly choosing a smelly element in our dataset (
|smellyelements |/|
all elements |
), which is 0.3%. This result shows that, in our dataset,
code refactorings are not applied to smelly elements by coincidence
because only 0.3% of the elements contain smells.
ICPC ’20, October 5–6, 2020, Seoul, Republic of Korea Sousa et al.
5.2 Analysis of Root-canal Refactoring
Table 3 presents a summary of all collected code refactorings. It is
structured in terms of two major columns. They indicate the two
samples we analyzed. The rst one provides information about all
collected code refactorings. The second column contains details
about the code refactorings manually classied as root-canal. The
rows present the refactoring types (3
r d subcolumn
), which are di-
vided according to their category (1
st subcolumn
) and the kind of
activity that they represent to the program (2
nd subcolumn
). The
category (1
st subcolumn
) indicates if a refactoring aects the archi-
tectural structure of: (i) elements within a class hierarchy (Within
Hierarchy), or (ii) elements across multiple hierarchies (Across Hier-
archies). The activities (2
nd subcolumn
) indicate the mechanism be-
hind each refactoring type: (i) extraction of statements of a method,
(i) moving up or down methods and attribute within a hierarchy, (iii)
moving members, or (iv) restructuring modules across hierarchies.
Root-canal refactoring is most applied to smelly elements.
As not all refactored elements may contain architectural problems,
we analyzed elements refactored through root-canal refactoring
[
30
]. The 7
th subcolumn
of Table III shows the 1,168 refactorings
manually validated as root canal (Section 4.4). We classied these
root-canal refactorings according to our three categories: smell-free,
single smell and multiple smells (subcolumns 8-10). This classica-
tion shows that most code refactorings (88.53%) were applied to
smelly elements, either with a single smell (21.23%) or with multiple
smells (67.30%). As likelihood of these elements to contain architec-
tural problems is high, this result indicates that developers tend to
prioritize architectural problems that manifest as multiple smells
in the source code rather than isolated smells. We only found a
low percentage (11.47%) of root-canal refactorings (8th sub-column)
aecting smell-free program elements (smell-free category). We no-
ticed that these code refactorings are those often involving: (i) very
simple classes being moved (but without internal smells on them)
across packages, (ii) one or more renames, or (iii) method inlining.
Smells as Indicators of architectural refactoring opportu-
nities.
When we analyzed only the smelly elements, most code
refactorings were applied to elements with multiple smells (67.30%).
This number is higher than the overall code refactorings applied to
the multiple smells category (47.38%) shown in Table 2. In summary,
when developers focus on repairing deteriorated code, most code
refactorings are applied to elements that contain multiple smells.
As these code refactorings occurred in the context of root-canal
refactoring, these elements are very likely to contain architectural
problems. Thus, we can consider that smells are likely to indicate
architectural refactoring to remove architectural problems. This
result is summarized in our rst nding:
Finding 1
: In general, code smells can be used to indicate
architectural refactoring to remove architectural problems
when developers intend to repair their deteriorated code.
5.3 Relating Code Refactoring, Smells and
Architectural Problems
When we analyze each refactoring type in Table 3, we noticed that
most types are applied to elements with multiple smells. Only ve
refactoring types did not follow this distribution (gray rows). Extract
Interface,Extract Superclass, and Rename Method were frequently
applied to single smell elements in 51.85%, 62.16% and 55.13% of the
cases, respectively. On the other hand, Move Class was frequently
applied to elements without smells (51.66%). These four types pro-
vide us with interesting discussions. The Rename Class does not
provide discussions due to its low number of operations.
Across hierarchy refactorings are often applied to single
smells.
Extract Interface,Extract Superclass and Move Class belong
to the Across Hierarchy category. These refactoring types comprise
the activity of moving (part of) modules across dierent hierarchies.
Extract Interface and Extract Superclass are usually architectural
refactoring. For instance, they can be used to solve architectural
problems such as Ambiguous Interface, Cyclic Dependency, Incom-
plete Abstraction, and Unused Abstraction. These problems are
related to an inappropriate architectural decision that a stakeholder
made when designing the abstraction for code elements. Some of
these architectural problems can be identied with only a single
smell. Thus, it is not surprising that most of Extract Interface and
Extract Superclass were applied to the single smell category.
Extract Interface refactorings applied to Lazy Classes.
Af-
ter analyzing the smells touched by Extract Interface, we found
that most them (46%) were Lazy Class. At rst, there seems to be
no logical relationship of Extract Interface with Lazy Class. How-
ever, some Lazy Class occurred in interfaces and in abstract classes.
In these cases, Extract Interface is considered as an architectural
refactoring. In fact, we found that this architectural refactoring
was applied to meet the Interface Segregation Principle (ISP) [
26
].
This principle is often followed to remove Ambiguous Interface
and Fat Interface. Nevertheless, Lazy Class alone does not suce to
indicate architectural refactoring opportunities. The reason is that
even well designed abstractions may be aected by this smell type.
Extract Superclass applied in complex implementations.
Analyzing the smells touched by Extract Superclass, we observed
that 37% were either Complex Class or God Class.Extract Super-
class applied to classes with these two smells were often related
to the implementation of architectural patterns such as Strategy
and Command. Such refactorings are directly related to removing
architectural problems, which make them architectural refactor-
ings. For example, we observed Extract Superclass being applied to
create super classes in hierarchies related to parsing (Elasticsearch)
and command/request handling (Apache Coyote and Spring Frame-
work). Complex Class and God Class are often associated with dier-
ent types of architectural problem [
1
,
36
,
58
]. However, we found
that the presence of any of them alone is usually not sucient to
indicate a architectural problem. We will discuss how such smells
may be indicators of architectural refactoring opportunities when
they are combined with other smells (Section 5.4).
Extract Superclass applied in lazy implementations.
We
found a high proportion (35%) of Lazy Class and Data Class. These
smells were mostly observed when Extract Superclass aimed at
improving polymorphism. For instance, we observed an Extract
Superclass refactoring in the ArgoUML system, which involved 53
Lazy Classes that represent elements of the UML metamodel. Such
code refactoring introduced a new abstraction that helped to create
more reusable implementations. Even though the refactorings im-
proved the reusability, most classes had no architectural problems.
The examples of Extract Superclass led us to our second nding:
When Are Smells Indicators of Architectural Refactoring Opportunities? A Study of 50 Soware Projects ICPC ’20, October 5–6, 2020, Seoul, Republic of Korea
Table 3: Categorization of Refactorings According to Root-canal Refactoring
All Collected Refactorings Validated Root-canal Refactorings
Category Activity Refactorting Type Total Floss
Refactoring
Root-canal
Refactoring Operations Smell-free Single
Smell
Multiple
Smells
Extract Method 15,629 12,084 (77.32%) 3,545 (22.68%) 345 3.77% 15.94% 80.29%
Extraction Inline Method 4,979 4,589 (92.17%) 390 (7.83%) 38 15.59% 13.16% 71.05%
Pull Up Method 4,610 2,585 (56.07%) 2,025 (43.93%) 197 3.05% 3.05% 93.91%
Pull Up Attribute 3,495 2,097 (60.00%) 1,398 (40.00%) 136 2.94% 25.00% 72.06%
Pull Down Method 575 482 (83.83%) 93 (16.17%) 9 0% 44.44% 55.56%
Within
Hierarchy Up/Down
Moves
Pull Down Attribute 246 133 (54.07%) 113 (45.93%) 11 9.09% 36.36% 54.55%
Move Method 5,000 3,900 (78.00%) 1,100 (22.00%) 107 2.80% 21.50% 75.70%Member
Moves Move Attribute 5,134 4,826 (94.00%) 308 (6.00%) 30 10.00% 33.33% 56.67%
Extract Inteface 614 336 (54.72%) 278 (45.28%) 27 7.41% 51.85% 40.74%
Extract Superclass 2,540 2,159 (85.00%) 381 (15.00%) 37 8.11% 62.16% 29.73%
Across
Hierarchies Module
Restructuring Move Class 2,042 490 (24.00%) 1,552 (76.00%) 151 51.66% 17.22% 31.13%
Rename Class 1,350 1,329 (98.44%) 21 (1.56%) 2 0% 50.00% 50.00%
Renaming Rename Method 5,013 4,211 (84.00%) 802 (16.00%) 78 19.23% 55.13% 25.64%
Total 51,227 39,221 (76.56%) 12,006 (23.44%) 1,168 11.47% 21.23% 67.30%
Finding 2
:Complex Class and God Class usually indicate
architectural refactoring opportunities when they occur to-
gether with other smells while Lazy Class and Data Class are
not.
Code smells are not able to indicate the need for Move
Class.
Interesting enough, Move Class was frequently applied to
the smell-free category. However, Move Class is a refactoring that is
closely related to architectural problems. This architectural refac-
toring can be applied to remove architectural problems such as
Scattered Concern and Component Overload. As we will discuss in
Section 5.4, Scattered Concerns are often indicated by smells such
as Dispersed Coupling,Divergent Change,Feature Envy,God Class,
Intensive Coupling, and Shotgun Surgery. However, by analyzing
the Move Class refactorings, we never observed such smells in the
moved classes. The reason is that a Move Class refactoring is unable
to remove architectural problems related to most of these smells.
Move Class is adequate to tackle architectural problems related to
high coupling and low cohesion at the component level. Smells
such as God Class and Intensive Coupling are best suited to indicate
these problems at the code element level, which are often removed
by other refactoring types, such as Extract Class and Move Method.
To verify our reasoning above, two researchers manually inves-
tigated the Move Class refactorings applied to smelly elements. We
found that most of the smells touched by Move Class refactorings
were either Data Class (21%) or Lazy Class (46%). Classes aected
by those smells were mostly Data Transfer Objects (DTOs) and
persistence entities. We also found many cases of smells such as
Complex Class (12%), Speculative Generality (6%), and Spaghetti Code
(4%) touched by Move Classes. However, those smells were never
removed by the Move Class refactorings. Thus, we conjecture that
many of these code refactorings were applied to organize classes
into sub-packages that best reect the system domain. Neverthe-
less, this intention has nothing to do with the detected smells. The
analysis of the Move Class led us to our third nding:
Finding 3
: Smells are not indicator of the Move Class archi-
tectural refactoring.
Rename Method as a complementary refactoring.
Both re-
name refactoring types did not follow the distribution of most
refactorings types. In the case of Rename Class, we cannot say
much since there were only two refactorings. On the other hand,
we can discuss the Rename Method, which it was most applied to
elements with only one smell. This refactoring type is not exactly
one directly related to an architectural problem. It is most likely that
is applied as part of a series of code refactorings. Thus, the Rename
Method can be applied in methods that contain a smell that points
out that the method name no longer makes sense. For instance,
61% of smells touched by Rename Method were Feature Envy. This
smell indicates that part of the method is more interested in another
class; thus, the method may be implementing two functionalities.
A developer can apply refactoring to move parts of the method to
another class. If s/he does it, the method may have to change the
name to be consistent with the functionality left behind. Then, s/he
applies the Rename Method before moving part of the functionality.
5.4 Analysis of Smell Patterns
Our previous analysis of refactored elements led to various discus-
sions on when smells are indicators of architectural refactoring
opportunities. We manually investigated 189 instances to support
these discussions. We used them to nd the refactorings to remove
each type of architectural problem. Based on this analysis and re-
lying on the literature [
9
,
22
,
24
,
28
,
29
,
31
,
55
], we searched for
patterns of smells and architectural problems. As result, we associ-
ated some types of smells with some types of architectural problems.
For instance, we noticed that usually when Dispersed Coupling,Fea-
ture Envy, and Long Method appeared together in an element, there
was a high likelihood that the element had the Misplaced Concern
problem. Based on our analysis, we dened a set of smell patterns,
which are shown in Table 4. A
smell pattern
represents one or
more types of smells (2
nd column
) that are likely to indicate an ar-
chitectural problem (1
st column
) if they appear in the code elements.
Consequently, if these smells appear together, then they indicate an
architectural refactoring opportunity. Thus, the developer should
use the recommended refactoring (Table 5) to get rid of the smell,
and, therefore, remove the architectural problem.
ICPC ’20, October 5–6, 2020, Seoul, Republic of Korea Sousa et al.
Table 4: Smell Patterns to Indicate Architectural Problems
Architectural Problem Code Smells
Ambiguous Interface
(Long Method and Feature Envy in the interface
and Dispersed Coupling in elements that are
clients or implement the interface)
Cyclic Dependency (Intensive Coupling and Shotgun Surgery)
Component Overload
(Shotgun Surgery, Divergent Change, Feature
Envy, God Class/Complex Class, Intensive Coupling,
and Long Method)
Concern Overload
(Complex Class, Divergent Change, Feature Envy,
God Class/Complex Class, Intensive Coupling,
Long Method, and Shotgun Surgery)
Fat Interface
(Shotgun Surgery in the interface) or (Divergent
Change, Dispersed Coupling, and Feature Envy in
elements that are clients or implement the interface
Incomplete Abstraction (Lazy Class)
Misplaced Concern (God Class/Complex Class or Dispersed Coupling, Feature
Envy, and Long Method)
Scattered Concern
(Dispersed Coupling, Divergent Change,
Feature Envy, God Class/Complex Class, Intensive Coupling,
and Shotgun Surgery)
Unused Abstraction (Speculative Generality)
Unwanted Dependency (Feature Envy, Long Method, and Shotgun Surgery)
Table 5: Recommended Refactoring for the Code Smell
Code Smell Common Refactorings
Complex Class Extract Method, Move Method, Extract Class [2]
Dispersed Coupling Extract Method
Divergent Change Extract Class [9]
Feature Envy Move Method, Move Field, Extract Field [9]
God Class Extract Class, Move Method, Move Field [2]
Invensive Coupling Move Method, Extract Method
Lazy Class Inline Class, Collapse Hierarchy [9]
Long Method Extract Method [9]
Shotgun Surgery Move Method, Move Field, Inline Class [9]
Speculative Generality Collapse Hierarchy, Inline Class,
Remove Parameter, Rename Method [9]
Patterns with one and multiple smells.
According to these
patterns, some architectural problems that can be identied by only
a single smell, such as Incomplete Abstraction and Unused Abstrac-
tion, while there are other architectural problems that are most
suitable to be identied by multiple code smells, such as Ambiguous
Interface, Cyclic Dependency, Scattered Concern and Unwanted
Dependency. Finally, there are architectural problems that can be
identied by a single or multiple smells, such as Concern Over-
load and Fat Interface. Table 6 shows the number of architectural
problems that we found in our database according to these patterns.
Table 6: Architectural Problems in Refactored Elements
Architectural Problems Single Smell Multiple Code Smells
Ambiguous Interface N/A 0
Ciclic Dependency N/A 31
Concern Overload 490 514
Fat Interface 0 2
Incomplete Abstraction 12,503 N/A
Scattered Concern N/A 26
Unused Abstraction 2,317 N/A
Unwanted Dependency N/A 11
N/A = Not Applicable
Smells that indicate architectural refactoring for Concern
Overload.
The two architectural problems in gray rows provide
us with interesting discussions. We found 1,004 classes that could
have Concern Overload architectural problem according to the
smell patterns. This problem can be identied with the pattern
Complex Class/God Class (490 instances) or with multiple smells:
Complex Class,Divergent Change,Feature Envy,God Class,Intensive
Coupling,Long Method, and Shotgun Surgery (514 instances). We
randomly sampled 100 classes: half with the Complex Class/God
Class pattern and the other half with the pattern with multiple
smells. Our manual validation shows that the pattern with multiple
smells is most likely to indicate architectural refactoring to remove
the problem. From the classes that had only Complex Class or God
Class, 26% of them had the Concern Overload (13 classes). From
the classes that had the pattern with multiple smells, 64% had the
architectural problem (32 classes).
We also investigated the relation of the Concern Overload with
Move Method and Move Attribute. We selected these refactoring
types because they are often architectural refactorings applied for
removing occurrences of Concern Overload. We found that 69%
of Move Methods and 30% of Move Attributes occurred in elements
aected by Complex Class,God Class or by their combination. The
high percentage of Move Methods and Move Attributes that touch
the Complex Class/God Class patterns is another evidence that such
patterns are indicators of architectural refactoring opportunities.
Smells that indicate architectural refactoring for Fat In-
terface.
Fat Interface is another architectural problem that can be
identied through one or multiple smells. However, we did not nd
any interface in the systems that had the pattern with one smell
(Shotgun Surgery). On the other hand, we found 2 instances of Fat
Interface when we searched for elements that had the pattern with
multiple smells (Divergent Change,Dispersed Coupling, and Feature
Envy). Even though there were only two instances of Fat Interface,
these are two more cases which show that the pattern with multiple
smells is more likely to indicate an architectural refactoring for
architectural problem than the pattern with only a single smell.
Architectural Refactoring of latent architectural problems.
According to the smell patterns, to identify Ambiguous Interface
architectural problem, a developer needs to nd code elements that
use the interface and contain Long Method,Feature Envy, and Dis-
persed Coupling. We did not nd elements that are connected to
the interface and contain this smell pattern. However, when we
considered only Long Method and Feature Envy, then the number
increases to 130 possible instances of architectural problems. This
result is interesting to show that these elements were presenting
the rst signs of an architectural problem. Even if the developers’
intention was not to remove an architectural problem, this result
indicates that they refactored elements that had a sign of a latent
architectural problem, and these elements had code smells that
could indicate a (potential) architectural problem.
Using multiple smells.
The results about Concern Overload,
Fat Interface, and Ambiguous Interface are useful to discuss to what
extent smells help developers to nd architectural refactoring op-
portunities. According to the validation of these 102 instances of
architectural problems, relying on a single smell may not be enough
to identify architectural refactoring for some architectural prob-
lems. Our data suggest that if developers rely on smell patterns with
When Are Smells Indicators of Architectural Refactoring Opportunities? A Study of 50 Soware Projects ICPC ’20, October 5–6, 2020, Seoul, Republic of Korea
multiple smells, they have a higher chance of identifying architec-
tural problems, which is consistent with results from other studies
[1, 23, 31, 55, 59]. This observation leads to our fourth nding:
Finding 4
: Patterns with multiple smells are more likely to
indicate architectural refactoring opportunities for Concern
Overload, Fat Interface, and Ambiguous Interface than pat-
terns with only one smell.
There are some pros and cons in using these smell patterns.
Analysis of various smells in some of these patterns may increase
developers’ condence on applying architectural refactorings. On
the other hand, to reason about multiple smells simultaneously may
be a hard task. Additionally, we highlight that developers cannot
identify an architectural refactoring opportunity if they expect
to nd all instances of the smells within a pattern. For example,
a developer would not identify any architectural refactoring to
remove Ambiguous Interface in our dataset if s/he was expecting to
nd instances of Long Method,Feature Envy, and Dispersed Coupling.
As mentioned, we did not nd elements connected to the interface
with all the smells in this pattern. Thus, developers cannot expect to
nd all the smells of a pattern to identify an architectural refactoring
opportunity, especially if the aected elements present the rst
signs of a problem. When we analyzed the relation of Ambiguous
Interface with Extract Method, for example, we observed 79% of
architectural refactorings occurring in elements aected either by
Long Method or Feature Envy. However, if we consider only when
both smells appear together, this percentage drops to 22%.
6 THREATS TO VALIDITY
We focused on refactored elements to identify elements with archi-
tectural problems. Regarding this threat, we are aware that smells
and architectural problems can appear in non-refactored elements;
thus, we are missing these elements. Consequently, we could (er-
roneously) conclude that smells are not indicators of architectural
refactoring opportunities. Nevertheless, we found that code refac-
torings are not applied to smelly elements by coincidence. The
chance of randomly choosing a smelly element in our dataset is
only 0.3%. Thus, refactoring operations indeed tend to concentrate
on smelly elements. Therefore, the use of smells to identify architec-
tural refactoring was appropriate. Additionally, we highlight that
we are interested in refactored elements because they are elements
which developers focused their eort on. Hence, these elements
may contain architectural problems that were important enough to
be architecturally refactored. Additionally, we focused our analy-
sis on root-canal refactoring since elements refactored during this
tactic have the highest chance to contain architectural problems.
We had to manually validate a set of refactoring types. We re-
lied on the students to perform this validation, which represents
a threat. In order to mitigate this threat, each refactoring instance
was validated by at least two students. In case there was a diver-
gence, one of the authors worked as third reviewer. In the second
inspection, we carefully characterized the refactoring as root-canal
or oss refactoring, which is another threat to internal validity.
Notice that such analysis is limited to two versions of the source
code directly impacted by the refactoring, i.e., not considering all
versions in the repository. Moreover, the manual analysis only con-
siders the constraint of behavior preservation in the elements that
were actually aected by the refactoring transformations.
The smell detection rules and the thresholds can be a threat.
Thus, the results are sensitive to code smell detection rules. As
mentioned before, such rules are based on thresholds. The risk is
that dierent thresholds can lead to completely distinct results. We
decided to not validate the detected smells because our investiga-
tion was focused on automatically detected smells. Thus, a manual
validation of smells would not contribute to our goal. To mitigate
this threat, we used rules and thresholds previously validated by
other researchers [2, 18, 31].
We assume that the elements that had a root-canal refactoring
have a high chance of containing architectural problems. This may
pose a threat since developers may apply root-canal refactoring
and even so, they do not remove architectural problems. How-
ever, we focused our analysis on root-canal refactoring because
this is the tactic that developers intentionally focus their eort
in repairing the structural quality. Indeed, a study with 328 engi-
neers, where 83% of them were developers, shown that developers
tend to create branches to apply refactorings exclusively [
16
], i.e.,
these branches mainly contain root-canal refactoring. Therefore,
the chance of developers target architectural problem during root-
canal is higher than when developers apply oss refactoring. To
support this assumption, we manually validated all the examples
discussed throughout the paper.
We selected a set of 50 software projects to analyze. Thus, the
representativeness of these projects is a threat. We mitigate it by
establishing a systematic process to select projects. As a result, we
obtained relevant Java projects with a diversity of structure and
size metrics.
7 RELATED WORK
We discuss some studies that are closely related to our study.
7.1 Code Smells and Architectural Problems
Code smells have been a well-researched topic over the last decade
[
22
,
24
,
28
,
29
,
31
,
33
,
53
,
55
]. For example, Tufano et al. [
53
] inves-
tigated when developers introduce code smells in their software
projects, and under what circumstances the introduction occurs.
First, they mined over 0.5M commits, and then they manually ana-
lyzed 9,164 commits to identify when the smells were introduced.
Among the results, they found that refactorings can also introduce
smells. In fact, they found that smells are introduced in the system
as consequence of maintenance and evolution activities.
Some studies investigated the developers’ perception of code
smells [
36
,
53
,
58
]. Yamashita and Moonen conducted an exploratory
survey with developers about their knowledge and concern with
code smells [
58
]. The authors investigated through developers’
perspective if code smells should be considered meaningful concep-
tualization of architectural problems. They applied a survey with
73 software developers. Based on the survey answers, they were
able to identify the smells that developers perceived as critical, why
they are critical, and what features a smell detection tool should
have. The results indicated that only 18% of respondents (13 de-
velopers) had a good understanding of code smells. However, the
ICPC ’20, October 5–6, 2020, Seoul, Republic of Korea Sousa et al.
majority of the developers (19 out of 50 developers who nished
the survey) are concerned about smells in their source code, while
14% (7 developers) were extremely concerned. Duplicated Code,God
Class, and Long Methods were the smells perceived as critical.
Palomba et al. analyzed if smells are perceived as problems [
36
].
They validated 12 dierent smells in three open source projects.
Next, they showed developers code snippets aected and not af-
fected by these smells. Then, developers answered if they considered
the code snippets as actual problems. If so, they asked developers
to explain what type of problems they perceived. They reported
that most code smells are, in general, not perceived by developers
as actual problems. However, there are some code smells (Complex
Class,God Class,Long Method, and Spaghetti Code) that developers
immediately perceived as problems for the source code structure.
Even though, these studies provided few examples of smells as
indicators of architectural problems. Our results are consistent with
these studies. We showed that God Class and Complex Class can
indicate architectural problems, especially if they occur with other
smell types. However, we further explain when these smells are
related to the architectural problems. Dierent from these studies,
our goal is to investigate if code smells can be used as indicators
of architectural refactoring opportunities. Yamashita and Moonen
focused on investigating to what extent developers had a theoreti-
cal knowledge of code smells; while Palomba et al. investigated if
developers perceive smells as actual problems.
7.2 Identication of Architectural Problems
Studies have investigated approaches to identify architectural prob-
lems [
27
,
56
]. Mo et al. [
27
] proposed a suite of hotspot patterns:
recurring architectural problems that lead to high maintenance
cost. They showed that these patterns might be the causes of bug-
proneness and change-proneness. Xiao et al. [
56
] introduced an
approach to identify and quantify architectural problems. The ap-
proach uses four patterns to show the correlation between archi-
tectural problems and the decrease of software quality.
Another study focused on investigating the identication of ar-
chitectural problems from the perspective of developers [
45
]. Sousa
et al. [
45
] proposed a theory to describe how architectural problem
identication happens in practice. Their theory explains factors
that inuence developers during the identication of architectural
problems. These studies focused on observing developers identify-
ing architectural problems, showing that developers rely on smells
and other indicators to identify these problems. However, they did
not investigate when smells can be used to indicate architectural
refactoring opportunities. In fact, our results indicate that there are
some cases that smells cannot indicate any architectural refactoring
to remove architectural problems.
7.3 Investigation of Architectural Refactoring
Few studies have investigated architectural refactorings [
17
,
19
,
39
,
41
,
61
]. For example, Kumar and Kumar [
17
] reported an archi-
tectural refactoring of a payment integration platform of a corpo-
rate banking organization. In this industrial experience report, the
authors presented the key drives that motivated the refactoring,
including architectural problems such as Concern Overload. As a
result, the refactorings led to signicant improvement in applica-
tion stability and throughput. Lin et al. [
19
] proposed an approach
to guide developers in applying architectural refactorings. In their
approach, the developers indicate the target architectural design,
and then the approach suggests stepwise code refactorings that will
change the source code to meet the target architecture.
Zimmermann [
61
] positioned architectural refactoring as a task-
centric technique for restructuring an existing architecture, allow-
ing him to focus on design rationale and related tasks instead. He
then “introduced a quality story template that identies potential
architectural smells and an architectural refactoring template that
lists the architectural decisions to be revisited as well as the design
and development tasks to be conducted when an architectural refac-
toring is applied.” Rizzi, Fontana and Roveda [
41
] proposed a tool
prototype to remove the Cyclic Dependency problem. Their tool
suggests the refactorings steps that the developer should follow to
remove the Cilic Dependency. Recently, Rachow [
39
] proposed a
research idea to develop a framework that (i) detects an architec-
tural problem, (ii) selects and prioritizes code refactorings, and (iii)
shows to developers the impact on the architecture. Although his
framework has not been implemented, the author indicates seven
architectural problems that the framework will provide support.
8 CONCLUSION
We conducted a study with 50 software projects to investigate when
smells are indicators of architectural refactoring opportunities. We
found that most refactored elements have at least one smell that
can indicate an architectural refactoring opportunity. From 52,667
code refactorings, 79.48% were applied to elements with at least one
smell while 47.38% were applied to elements with multiple smells.
Analyzing the root-canal refactorings, we found that developers
have a higher chance to identify an architectural refactoring to
remove an architectural problem when it is related to multiple
smells. As a result of our analyses, we came out with a set of smell
patterns that are very likely to indicate architectural refactoring
opportunities. We are the rst one to catalog these patterns.
We identied important implications for researchers and prac-
titioners. For example, we found that God Class,Complex Class,
Lazy Class and Data Class appear in classes that developers applied
Extract Superclass. However, only God Class and Complex Class are
indicators of architectural refactoring opportunities. Tool builders
can take advantage of this information to prioritize the recommen-
dation of architectural refactorings according to smell types. The
patterns with multiple smells can be used by developers to identify
and refactor architectural problems such as Concern Overload.
As future work, we intend to investigate the smell patterns.
Our goal is to explore them to help developers in automatically
identifying architectural refactoring opportunities and to guide
them to apply code refactorings to remove an architectural problem.
ACKNOWLEDGMENT
We want to thank the reviewers for their valuable suggestions. This
work is funded by CNPq (grants 434969/2018-4, 312149/2016-6),
CAPES (grant 175956), and FAPERJ (grant 22520-7/2016).
When Are Smells Indicators of Architectural Refactoring Opportunities? A Study of 50 Soware Projects ICPC ’20, October 5–6, 2020, Seoul, Republic of Korea
REFERENCES
[1]
M Abbes, F Khomh, Y Gueheneuc, and G Antoniol. 2011. An Empirical Study
of the Impact of Two Antipatterns, Blob and Spaghetti Code, on Program Com-
prehension. In Proceedings of the 15th European Software Engineering Conference;
Oldenburg, Germany. 181–190.
[2]
Gabriele Bavota, Andrea De Lucia, Massimiliano Di Penta, Rocco Oliveto, and
Fabio Palomba. 2015. An experimental investigation on the innate relationship
between quality and refactoring. Journal of Systems and Software 107 (2015), 1 –
14. https://doi.org/10.1016/j.jss.2015.05.024
[3]
Timothy A. Budd. 2001. An Introduction to Object-Oriented Programming (3rd
ed.). Addison-Wesley Longman Publishing Co., Inc., Boston, MA, USA.
[4]
Diego Cedrim, Alessandro Garcia, Melina Mongiovi, Rohit Gheyi, Leonardo Sousa,
Rafael de Mello, Baldoino Fonseca, Márcio Ribeiro, and Alexander Chávez. 2017.
Understanding the Impact of Refactoring on Smells: A Longitudinal Study of 23
Software Projects. In Proceedings of the 2017 11th Joint Meeting on Foundations
of Software Engineering (ESEC/FSE 2017). ACM, New York, NY, USA, 465–475.
https://doi.org/10.1145/3106237.3106259
[5]
Alexander Chávez, Isabella Ferreira, Eduardo Fernandes, Diego Cedrim, and
Alessandro Garcia. 2017. How Does Refactoring Aect Internal Quality At-
tributes?: A Multi-project Study. In Proceedings of the 31st Brazilian Sympo-
sium on Software Engineering (SBES’17). ACM, New York, NY, USA, 74–83.
https://doi.org/10.1145/3131151.3131171
[6]
O. Ciupke. 1999. Automatic detection of design problems in object-oriented
reengineering. In Proceedings of Technology of Object-Oriented Languages and
Systems - TOOLS 30 (Cat. No.PR00278). 18–32.
[7]
Bill Curtis, Jay Sappidi, and Alexandra Szynkarski. 2012. Estimating the Size, Cost,
and Types of Technical Debt. In Proceedings of the Third International Workshop
on Managing Technical Debt (MTD ’12). IEEE Press, Piscataway, NJ, USA, 49–53.
http://dl.acm.org/citation.cfm?id=2666036.2666045
[8]
Neil A. Ernst, Stephany Bellomo, Ipek Ozkaya, Robert L. Nord, and Ian Gorton.
2015. Measure It? Manage It? Ignore It? Software Practitioners and Technical
Debt. In Proceedings of the 2015 10th Joint Meeting on Foundations of Software
Engineering (ESEC/FSE 2015). Association for Computing Machinery, New York,
NY, USA, 50âĂŞ60. https://doi.org/10.1145/2786805.2786848
[9]
M Fowler. 1999. Refactoring: Improving the Design of Existing Code. Addison-
Wesley Professional, Boston.
[10] J Garcia, D Popescu, G Edwards, and N Medvidovic. 2009. Identifying Architec-
tural Bad Smells. In CSMR09; Kaiserslautern, Germany. IEEE.
[11]
Joshua Garcia, Daniel Popescu, George Edwards, and Nenad Medvidovic. 2009.
Toward a Catalogue of Architectural Bad Smells. In Architectures for Adaptive
Software Systems, Raaela Mirandola, Ian Gorton, and Christine Hofmeister (Eds.).
Springer Berlin Heidelberg, Berlin, Heidelberg, 146–162.
[12]
M Godfrey and E Lee. 2000. Secrets from the Monster: Extracting Mozilla’s
Software Architecture. In CoSET-00; Limerick, Ireland. 15–23.
[13]
Phil Greenwood, Thiago Bartolomei, Eduardo Figueiredo, Marcos Dosea, Alessan-
dro Garcia, Nelio Cacho, Cláudio Sant’Anna, Sergio Soares, Paulo Borba, Uirá
Kulesza, and Awais Rashid. 2007. On the Impact of Aspectual Decompositions
on Design Stability: An Empirical Study. In Proceedings of the 21st European
Conference on Object-Oriented Programming (ECOOP’07). Springer-Verlag, Berlin,
Heidelberg, 176–200. http://dl.acm.org/citation.cfm?id=2394758.2394771
[14]
P. Kaminski. 2007. Reforming Software Design Documentation. In 14th Working
Conference on Reverse Engineering (WCRE 2007). 277–280.
[15]
Miryung Kim, Thomas Zimmermann, and Nachiappan Nagappan. 2012. A Field
Study of Refactoring Challenges and Benets. In Proceedings of the 20th Interna-
tional Symposium on the Foundations of Software Engineering (FSE ’12). ACM, New
York, NY, USA, Article 50, 11 pages. https://doi.org/10.1145/2393596.2393655
[16]
Miryung Kim, Thomas Zimmermann, and Nachiappan Nagappan. 2014. An em-
pirical study of refactoringchallenges and benets at microsoft. IEEE Transactions
on Software Engineering 40, 7 (2014), 633–649.
[17]
M. Raveendra Kumar and R. Hari Kumar. 2011. Architectural Refactoring of a Mis-
sion Critical Integration Application: A Case Study. In Proceedings of the 4th India
Software Engineering Conference (ISEC âĂŹ11). Association for Computing Ma-
chinery, New York, NY, USA, 77âĂŞ83. https://doi.org/10.1145/1953355.1953365
[18]
M Lanza and R Marinescu. 2006. Object-Oriented Metrics in Practice. Springer,
Heidelberg.
[19]
Yun Lin, Xin Peng, Yuanfang Cai, Danny Dig, Diwen Zheng, and Wenyun Zhao.
2016. Interactive and Guided Architectural Refactoring with Search-Based Recom-
mendation. In Proceedings of the 2016 24th ACM SIGSOFT International Symposium
on Foundations of Software Engineering (FSE 2016). Association for Computing Ma-
chinery, New York, NY, USA, 535âĂŞ546. https://doi.org/10.1145/2950290.2950317
[20]
M. Lippert and S. Roock. 2006. Refactoring in Large Software Projects: Performing
Complex Restructurings Successfully. Wiley. https://books.google.com.br/books?
id=bCEYuB83ROcC
[21]
A MacCormack, J Rusnak, and C Baldwin. 2006. Exploring the Structure of
Complex Software Designs: An Empirical Study of Open Source and Proprietary
Code. Manage. Sci. 52, 7 (2006), 1015–1030.
[22]
I Macia. 2013. On the Detection of Architecturally-Relevant Code Anomalies in
Software Systems. Ph.D. Dissertation. Pontical Catholic University of Rio de
Janeiro, Informatics Department.
[23]
I. Macia, R. Arcoverde, A. Garcia, C. Chavez, and A. von Staa. 2012. On the Rele-
vance of Code Anomalies for Identifying Architecture Degradation Symptoms.
In CSMR12. 277–286.
[24]
Isela Macia, Joshua Garcia, Daniel Popescu, Alessandro Garcia, Nenad Medvi-
dovic, and Arndt von Staa. 2012. Are Automatically-detected Code Anomalies
Relevant to Architectural Modularity?: An Exploratory Analysis of Evolving
Systems. In AOSD ’12. ACM, New York, NY, USA, 167–178.
[25]
Marinescu. 2004. Detection strategies: metrics-based rules for detecting design
aws. In Proceedings of 20th IEEE International Conference on Software Maintenance
(ICSM); Chicago, USA. 350–359.
[26]
Robert C. Martin and Micah Martin. 2006. Agile Principles, Patterns, and Practices
in C# (Robert C. Martin). Prentice Hall PTR, Upper Saddle River, NJ, USA.
[27]
Ran Mo, Yuanfang Cai, R. Kazman, and Lu Xiao. 2015. Hotspot Patterns: The
Formal Denition and Automatic Detection of Architecture Smells. In Software
Architecture (WICSA), 2015 12th Working IEEE/IFIP Conference on. 51–60.
[28]
N. Moha, Y. g. Gueheneuc, and P. Leduc. 2006. Automatic Generation of Detection
Algorithms for Design Defects. In 21st IEEE/ACM International Conference on
Automated Software Engineering (ASE’06). 297–300. https://doi.org/10.1109/ASE.
2006.22
[29]
N Moha, Y Gueheneuc, L Duchien, and A Le Meur. 2010. DECOR: A Method for
the Specication and Detection of Code and Design Smells. IEEE Transaction on
Software Engineering 36 (2010), 20–36.
[30]
Emerson Murphy-Hill, Chris Parnin, and Andrew P. Black. 2009. How We
Refactor, and How We Know It. In Proceedings of the 31st International Conference
on Software Engineering (ICSE ’09). IEEE Computer Society, Washington, DC,
USA, 287–297. https://doi.org/10.1109/ICSE.2009.5070529
[31]
W Oizumi, A Garcia, L Sousa, B Cafeo, and Y Zhao. 2016. Code Anomalies
Flock Together: Exploring Code Anomaly Agglomerations for Locating Design
Problems. In The 38th International Conference on Software Engineering; USA.
[32]
Willian Nalepa Oizumi, Leonardo da Silva Sousa, Anderson Oliveira, Alessandro
Garcia, O. I. Anne Benedicte Agbachi, Roberto Felicio Oliveira, and Carlos Lucena.
2018. On the identication of design problems in stinky code: experiences and
tool support. J. Braz. Comp. Soc. 24, 1 (2018), 13:1–13:30. https://doi.org/10.1186/
s13173-018- 0078-y
[33]
Anderson Oliveira, Leonardo Sousa, Willian Oizumi, and Alessandro Garcia. 2019.
On the Prioritization of Design-Relevant Smelly Elements: A Mixed-Method,
Multi-Project Study. In Proceedings of the XIII Brazilian Symposium on Software
Components, Architectures, and Reuse (SBCARS ’19). ACM, New York, NY, USA,
83–92. https://doi.org/10.1145/3357141.3357142
[34]
2020 Replication Package. 2020. https://gshare.com/s/3ac0da284f700d186dfa.
(2020).
[35]
Meilir Page-Jones. 2000. Fundamentals of Object-oriented Design in UML. Addison-
Wesley Longman Publishing Co., Inc., Boston, MA, USA.
[36]
F. Palomba, G. Bavota, M. D. Penta, R. Oliveto, and A. D. Lucia. 2014. Do They
Really Smell Bad? A Study on Developers’ Perception of Bad Code Smells. In 2014
IEEE International Conference on Software Maintenance and Evolution. 101–110.
https://doi.org/10.1109/ICSME.2014.32
[37]
David L. Parnas. 1978. Designing Software for Ease of Extension and Contraction.
In Proceedings of the 3rd International Conference on Software Engineering (ICSE
’78). IEEE Press, Piscataway, NJ, USA, 264–277.
[38]
Dewayne E. Perry and Alexander L. Wolf. 1992. Foundations for the Study
of Software Architecture. SIGSOFT Softw. Eng. Notes 17, 4 (Oct. 1992), 40–52.
https://doi.org/10.1145/141874.141884
[39]
P. Rachow. 2019. Refactoring Decision Support for Developers and Architects
Based on Architectural Impact. In 2019 IEEE International Conference on Software
Architecture Companion (ICSA-C). 262–266. https://doi.org/10.1109/ICSA-C.2019.
00054
[40]
Refactoring Oracle [n. d.]. Refactoring Oracle. ([n. d.]). http://refactoring.encs.
concordia.ca/oracle/
[41]
Luca Rizzi, Francesca Arcelli Fontana, and Riccardo Roveda. 2018. Support for
Architectural Smell Refactoring. In Proceedings of the 2nd International Workshop
on Refactoring (IWoR 2018). Association for Computing Machinery, New York,
NY, USA, 7âĂŞ10. https://doi.org/10.1145/3242163.3242165
[42]
Santonu Sarkar, Shubha Ramachandran, G. Sathish Kumar, Madhu K. Iyengar,
K. Rangarajan, and Saravanan Sivagnanam. 2009. Modularization of a Large-
Scale Business Application: A Case Study. IEEE Softw. 26, 2 (March 2009), 28–35.
https://doi.org/10.1109/MS.2009.42
[43]
S Schach, B Jin, D Wright, G Heller, and A Outt. 2002. Maintainability of the
Linux kernel. Software, IEE Proceedings - 149, 1 (2002), 18–23.
[44]
Marcelino Campos Oliveira Silva, Marco Tulio Valente, and Ricardo Terra. 2016.
Does Technical Debt Lead to the Rejection of Pull Requests?. In Proceedings of
the 12th Brazilian Symposium on Information Systems (SBSI ’16). 248–254.
[45]
Leonardo Sousa, Anderson Oliveira, Willian Oizumi, Simone Barbosa, Alessandro
Garcia, Jaejoon Lee, Marcos Kalinowski, Rafael de Mello, Baldoino Fonseca,
ICPC ’20, October 5–6, 2020, Seoul, Republic of Korea Sousa et al.
Roberto Oliveira, Carlos Lucena, and Rodrigo Paes. 2018. Identifying Design
Problems in the Source Code: A Grounded Theory. In Proceedings of the 40th
International Conference on Software Engineering (ICSE ’18). ACM, New York, NY,
USA, 921–931. https://doi.org/10.1145/3180155.3180239
[46]
Leonardo Sousa, Roberto Oliveira, Alessandro Garcia, Jaejoon Lee, Tayana Conte,
Willian Oizumi, Rafael de Mello, Adriana Lopes, Natasha Valentim,Edson Oliveira,
and Carlos Lucena. 2017. How Do Software Developers Identify Design Prob-
lems?: A Qualitative Analysis. In Proceedings of 31st Brazilian Symposium on
Software Engineering (SBES’17). 12.
[47]
Michael Stal. 2014. Chapter 3 - Refactoring Software Architectures. In Agile
Software Architecture, Muhammad Ali Babar, Alan W. Brown, and Ivan Mistrik
(Eds.). Morgan Kaufmann, Boston, 63 – 82. https://doi.org/10.1016/B978-0-12-
407772-0.00003- 4
[48]
Girish Suryanarayana, Ganesh Samarthyam, and Tushar Sharma. 2014. Refac-
toring for Software Design Smells: Managing Technical Debt (1st ed.). Morgan
Kaufmann Publishers Inc., San Francisco, CA, USA.
[49]
A. Trifu and R. Marinescu. 2005. Diagnosing design problems in object oriented
systems. In WCRE’05. 10 pp.
[50]
Adrian Trifu and Urs Reupke. 2007. Towards Automated Restructuring of Object
Oriented Systems. In CSMR ’07. IEEE, Washington, DC, USA, 39–48.
[51]
Nikolaos Tsantalis, Victor Guana, Eleni Stroulia, and Abram Hindle. 2013. A
multidimensional empirical study on refactoring activity. In Proceedings of the
2013 Conference of the Center for Advanced Studies on Collaborative Research. IBM
Corp., 132–146.
[52]
Nikolaos Tsantalis, Matin Mansouri, Laleh M. Eshkevari, Davood Mazinanian, and
Danny Dig. 2018. Accurate and Ecient Refactoring Detection in Commit History.
In Proceedings of the 40th International Conference on Software Engineering (ICSE
’18). ACM, New York, NY, USA, 483–494. https://doi.org/10.1145/3180155.3180206
[53]
Michele Tufano, Fabio Palomba, Gabriele Bavota, Rocco Oliveto, Massimiliano
Di Penta, Andrea De Lucia, and Denys Poshyvanyk. 2015. When and Why Your
Code Starts to Smell Bad. In Proceedings of the 37th International Conference on
Software Engineering (ICSE ’15). ACM, New York, NY, USA, 13.
[54]
J van Gurp and J Bosch. 2002. Design erosion: problems and causes. Journal of
Systems and Software 61, 2 (2002), 105 – 119.
[55]
S. Vidal, E. Guimaraes, W. Oizumi, A. Garcia, A. D. Pace, and C. Marcos. 2016.
Identifying Architectural Problems through Prioritization of Code Smells. In
SBCARS16. 41–50.
[56]
Lu Xiao, Yuanfang Cai, Rick Kazman, Ran Mo, and Qiong Feng. 2016. Identifying
and Quantifying Architectural Debt. In Proceedings of the 38th International
Conference on Software Engineering (ICSE ’16). ACM, New York, NY, USA, 488–
498. https://doi.org/10.1145/2884781.2884822
[57]
Zhenchang Xing and Eleni Stroulia. 2005. UMLDi: An Algorithm for Object-
oriented Design Dierencing. In Proc. of ASE ’05. 54–65. https://doi.org/10.1145/
1101908.1101919
[58]
A. Yamashita and L. Moonen. 2013. Do developers care about code smells?
An exploratory survey. In 2013 20th Working Conference on Reverse Engineering
(WCRE). 242–251. https://doi.org/10.1109/WCRE.2013.6671299
[59]
A Yamashita and L Moonen. 2013. Exploring the impact of inter-smell rela-
tions on software maintainability: an empirical study. In Proceedings of the 35th
International Conference on Software Engineering; San Francisco, USA. 682–691.
[60]
O. Zimmermann. 2015. Architectural Refactoring: A Task-Centric View on
Software Evolution. IEEE Software 32, 02 (mar 2015), 26–29. https://doi.org/10.
1109/MS.2015.37
[61]
Olaf Zimmermann. 2017. Architectural refactoring for the cloud: a decision-
centric view on cloud migration. Computing 99, 2 (01 Feb 2017), 129–145. https:
//doi.org/10.1007/s00607-016- 0520-y
... Root-canal refactoring is also typically described as correcting unhealthy code, emphasizing quality improvements rather than other motivations. Multiple empirical studies that use analysis of commit histories or IDE usage have found more evidence of floss refactoring than evidence of root-canal refactoring [19,24,32]. Murphy-Hill et al's study [24] further suggests that "studies should focus on floss refactoring for the greatest generality. ...
... The work of Lin et al. is closest in its motivation to the large-scale refactoring notion that our survey investigated [18], but their work is focused on developing a research tool that relies on recommendations of a limited number of refactorings. Other work in architecture refactoring mostly addresses code and architecture smell detection, which often focus narrowly on quality symptoms to hint at opportunities for architecture level changes [28,32]. Our study, in contrast, takes a broad perspective on the range of activities and supporting tools from the perspective of developers performing large-scale refactoring in industry. ...
Preprint
Full-text available
Software refactoring plays an important role in software engineering. Developers often turn to refactoring when they want to restructure software to improve its quality without changing its external behavior. Studies show that small-scale (floss) refactoring is common in industry and can often be performed by a single developer in short sessions, even though developers do much of this work manually instead of using refactoring tools. However, some refactoring efforts are much larger in scale, requiring entire teams and months of effort, and the role of tools in these efforts is not as well studied. In this paper, we report on a survey we conducted with developers to understand large-scale refactoring, its prevalence, and how tools support it. Our results from 107 industry developers demonstrate that projects commonly go through multiple large-scale refactorings, each of which requires considerable effort. While there is often a desire to refactor, other business concerns such as developing new features often take higher priority. Our study finds that developers use several categories of tools to support large-scale refactoring and rely more heavily on general-purpose tools like IDEs than on tools designed specifically to support refactoring. Tool support varies across the different activities, with some particularly challenging activities seeing little use of tools in practice. Our study demonstrates a clear need for better large-scale refactoring tools and an opportunity for refactoring researchers to make a difference in industry. The results we summarize in this paper is one concrete step towards this goal.
... Code smell is a surface indicator of deeper structural problems in the software system [11,24,29]. In fact, the presence of code smells can even indicate the need for a refactoring [11]. ...
Conference Paper
Full-text available
Structural degradation is the process in which quality attributes of a system are negatively impacted. When due attention is not paid to structural degradation, the source code may also become difficult to change. Code smells are recurring structures in the source code that may represent structural degradation. Hence, there are many catalogs and techniques for supporting the removal of code smells through refactoring recommendations, which usually consist of single refactorings such as a Move Method or an Extract Method. However, single refactorings are often not enough for completely removing certain smell occurrences. Moreover, recent studies show that developers most often apply composite refactorings - i.e., sequences of two or more refactorings - for removing code smells. Despite showing the importance of performing composite refactorings, most studies do not provide information on which composite refactoring patterns are recurrent in practice. In this context, a previous study identified 35 smell removal patterns that are frequent across multiple open source systems. However, such study has not explored how the removal patterns could help developers to apply effective composite refactorings. Thus, in this work, we propose a suite of new recommendation heuristics to help developers in applying effective composite refactorings. These heuristics are intended to remove three code smell types, namely Complex Class, Feature Envy, and God Class. After designing the heuristics, we evaluated their effectiveness through a quasi-experiment. This evaluation was conducted with 12 software developers and 9 smelly Java classes. Results indicate that developers considered our heuristics effective or partially effective in more than 93% of the cases. In addition, the evaluation helped us to identify multiple factors that contribute to the acceptance or rejection of the refactoring recommendations. Based on these factors, we defined new guidelines for the effective recommendation of smell-removal composite refactorings.
... Understanding the influence of intents on refactorings. Many existing techniques for recommending refactorings are designed to support developers in performing pure structural improvement with at least three goals: (i) pure refactoring of a specific module or the entire program [28], (ii) removing code smells [11,21,26,27,33,34], or (iii) re-architecting a system [14,24,31,42]. However, our study found that refactorings are less commonly applied in changes with the sole purpose of either optimizing or improving the low-level structure of a class (or one or more methods), its architecture or even a specific module. ...
Conference Paper
Full-text available
Code refactorings are of pivotal importance in modern code review. Developers may preserve, revisit, add or undo refactorings through changes' revisions. Their goal is to certify that the driving intent of a code change is properly achieved. Developers' intents behind refactorings may vary from pure structural improvement to facilitating feature additions and bug fixes. However, there is little understanding of the refactoring practices performed by developers during the code review process. It is also unclear whether the devel-opers' intents influence the selection, composition, and evolution of refactorings during the review of a code change. Through mining 1,780 reviewed code changes from 6 systems pertaining to two large open-source communities, we report the first in-depth empirical study on software refactoring during code review. We inspected and classified the developers' intents behind each code change into 7 distinct categories. By analyzing data generated during the complete reviewing process, we observe: (i) how refactorings are selected, composed and evolved throughout each code change, and (ii) how developers' intents are related to these decisions. For instance , our analysis shows developers regularly apply non-trivial sequences of refactorings that crosscut multiple code elements (i.e., widely scattered in the program) to support a single feature addition. Moreover, we observed that new developers' intents commonly emerge during the code review process, influencing how developers select and compose their refactorings to achieve the new and adapted goals. Finally, we provide an enriched dataset that allows researchers to investigate the context and motivations behind refactoring operations during the code review process. CCS CONCEPTS • Software and its engineering → Software maintenance tools; Software evolution.
Article
The JavaScript language did not specify, until ECMAScript 6 (ES6), native features for streamlining encapsulation and modularity. Developer community filled the gap with a proliferation of design patterns and module formats, with impact on code reusability, portability and complexity of build configurations. This work studies the automated refactoring of legacy ES5 code to ES6 modules with fine-grained reuse of module contents through the named import/export language constructs. The focus is on reducing the coupling of refactored modules through destructuring exported module objects to fine-grained module features and enhancing module dependencies by leveraging the ES6 syntax. We employ static analysis to construct a model of a JavaScript project, the Module Dependence Graph (MDG), that represents modules and their dependencies. On the basis of MDG we specify the refactoring procedure for module migration to ES6. A prototype implementation has been empirically evaluated on 19 open source projects. Results highlight the relevance of the refactoring with a developer intent for fine-grained reuse. The analysis of refactored code shows an increase in the number of reusable elements per project and reduction in the coupling of refactored modules. The soundness of the refactoring is empirically validated through code inspection and execution of projects’ test suites.
Conference Paper
Full-text available
Software systems are likely to face what is called design problems. Given the typical lack of design documentation, developers have to rely on implementation-level symptoms, the so-called code smells, to identify and remove design problems. A code smell is a micro-structure in the program that can indicate the presence of a design problem. Large programs have hundreds or thousands of program elements (e.g., classes) in which a significant proportion may be affected by smells. Consequently, due to time constraints and the large number of elements, developers have to prioritize the design-relevant program elements, i.e., locate a shortlist of elements that are suspects of having design-relevant smells. However, this task is hard and time-consuming. Unfortunately, the literature fails to provide developers with effective heuristics that automate such pri-oritization task. The objective of this paper is to propose heuristics that effectively locate a shortlist of design-relevant smelly program elements. For this purpose, we report two studies. In the first one, we investigated the criteria that developers used in practice to accurately prioritize design-relevant smelly elements. Based on these criteria, we derived a preliminary suite of prioritization heuristics. Since we do not know if the heuristics are suitable for an effective prioritization across multiple projects, we performed a second study to evaluate the proposed heuristics. We found that two out of nine heuristics reached an average precision higher than 75% for the four projects we analyzed. Thus, our findings suggest these heuristics are promising to support developers in prioritizing design-relevant smelly elements.
Article
Full-text available
Abstract Background Developers often have to locate design problems in the source code. Several types of design problems may manifest as code smells in the program. A code smell is a source code structure that may reveal a partial hint about the manifestation of a design problem. Recent studies suggest that developers should ignore smells occurring in isolation in a program location. Instead, they should focus on analyzing stinkier code, i.e., program locations—e.g., a class or a hierarchy—affected by multiple smells. There is evidence that the stinkier a program location is, the more likely it contains a design problem. However, there is no empirical evidence on whether developers can effectively identify a design problem in stinkier code. Developers may struggle to make an analysis of inter-related smells affecting the same program location. Besides that, the analysis of stinkier code may require proper tool support due to its analysis complexity. However, there is little knowledge on what are the requirements for a tool that helps developers in revealing stinkier program locations. As a result, developers may not be able to identify design problems due to tool issues. Method To address this matter, we aimed at achieving three goals. In the first case, we proposed Organic—a tool supporting the analysis of stinky code. In the second case, we applied a mixed-method approach to analyze if and how developers can effectively find design problems when reflecting upon stinky code—i.e., a program location affected by multiple smells. We conducted a study with 11 software professionals. Finally, in the third case, we aimed at understanding if Organic could be used by developers to identify design problems. To achieve this goal, we used a method from the Semiotic Engineering theory. This method enabled us to evaluate what are the tool issues that may hinder the identification of design problems in stinky code. Result Our study revealed that only 36.36% of the developers found more design problems when explicitly reasoning about multiple smells as compared to single smells. Moreover, 63.63% of the developers reported much lesser false positives when using the first approach as compared to the latter. The second study, in its turn, showed that most developers may be unable to identify design problems in stinky code without proper tool support. Conclusion Our experiences, in particular the second study, helped us to refine the features of Organic for better supporting developers in reflecting upon stinkier code. For example, analyses of stinky code scattered in class hierarchies or packages is often difficult, time-consuming, and requires proper visualization support. Moreover, without effective support, it remains time-consuming to discard stinky program locations that do not represent design problems.
Conference Paper
Full-text available
To preserve high quality of a project, it is necessary to perform many refactoring steps during the development and maintenance phases. Code refactoring received a great attention in the literature, often directed to the refactoring of code smells. While at the architectural level, architectural smells received less attention. One of the most common architectural smell which may affect many parts of a software application is the Cyclic Dependency smell. This position paper presents a tool prototype that suggests which path a developer could follow to remove Cyclic Dependency smells in Java applications. The tool has been developed as an extension of the Arcan tool for architectural smells detection. In this paper we describe how the tool identifies the path to follow for the refactoring of Cyclic Dependency on real projects. The tool has been thought only to suggest a possible refactoring approach which a developer could considerate to resolve Cyclic Dependency.
Conference Paper
Full-text available
The prevalence of design problems may cause re-engineering or even discontinuation of the system. Due to missing, informal or outdated design documentation, developers often have to rely on the source code to identify design problems. Therefore, developers have to analyze different symptoms that manifest in several code elements, which may quickly turn into a complex task. Although researchers have been investigating techniques to help developers in identifying design problems, there is little knowledge on how developers actually proceed to identify design problems. In order to tackle this problem, we conducted a multi-trial industrial experiment with professionals from 5 software companies to build a grounded theory. The resulting theory offers explanations on how developers identify design problems in practice. For instance, it reveals the characteristics of symptoms that developers consider helpful. Moreover, developers often combine different types of symptoms to identify a single design problem. This knowledge serves as a basis to further understand the phenomena and advance towards more effective identification techniques. CCS CONCEPTS • Software and its engineering → Software design engineering ; KEYWORDS design problem, grounded theory, software design, symptoms ACM Reference Format:
Conference Paper
Full-text available
When a software design decision has a negative impact on one or more quality attributes, we call it a design problem. For example, the Fat Interface problem indicates that an interface exposes non-cohesive services Thus, clients and implementations of this interface may have to handle with services that they are not interested. A design problem such as this hampers the extensibility and maintainability of a software system. As illustrated by the example, a single design problem often affects several elements in the program. Despite its harmfulness, it is difficult to identify a design problem in a system. It is even more challenging to identify design problems when the source code is the only available artifact. In particular, no study has observed what strategy(ies) developers use in practice to identify design problems when the design documentation is unavailable. In order to address this gap, we conducted a qualitative analysis on how developers identify design problems in two different scenarios: when they are either familiar (Scenario 1) or unfamiliar (Scenario 2) with the analyzed systems. Developers familiar with the systems applied a diverse set of strategies during the identification of each design problem. Some strategies were frequently used to locate code elements for analysis, and other strategies were frequently used to confirm design problems in these elements. Developers unfamiliar with the systems relied only on the use of code smells along the task. Despite some differences among the subjects from both scenarios, we noticed that developers often search for multiple indicators during the identification of each design problem.
Conference Paper
Full-text available
Refactoring is a technique commonly applied by developers along the software maintenance and evolution. Software refactoring is expected to improve the internal quality attributes of a software project, such as coupling and cohesion. However, there is limited understanding on to what extent developers achieve this expectation when refactoring their source code. This study investigates how refactoring operations affect five well-known internal quality attributes: cohesion, coupling, complexity, inheritance, and size. For this purpose, we analyze the version history of 23 open source projects with 29,303 refactoring operations. Our analysis revealed interesting observations. First, we noticed that developers apply more than 94% of the refactoring operations to code elements with at least one critical internal quality attribute, as opposed to previous work. Second, 65% of the refactoring operations improve their related internal quality attributes and the remaining 35% operations keep the quality attributes unaffected. Third, whenever pure refactoring operations are applied (the so-called root-canal refactoring), we confirm that internal quality attributes are either frequently improved or at least not worsened. Finally, while refactoring operations often reach other specific aims, such as adding a new feature or fixing a bug, 55% of these operations surprisingly improve internal quality attributes against only 10% of quality decline.
Chapter
Technical debt is the debt that accrues when developers knowingly or unknowingly make wrong or nonoptimal design decisions. Technical debt is akin to financial debt. Allowing a small debt to occur is acceptable provided it is paid soon enough; not paying the debt for a longer period invites bigger troubles. Similarly, in a software project, technical debt needs to be repaid regularly to avoid its accumulation. Large technical debt significantly degrades the quality of the software system and affects the productivity of the development team. In extreme cases, when the accumulated technical debt becomes so huge that it cannot be paid off anymore, the product has to be abandoned. This chapter emphasizes the importance of the concept of technical debt, the factors that contribute to it, and its impact on software projects.
Conference Paper
Technical Debt is a term used to classify non-optimal solutions during software development. These solutions cause several maintenance problems and hence they should be avoided or at least documented. Although there are a considered number of studies that focus on the identification of Technical Debt, we focus on the identification of Technical Debt in pull requests. Specifically, we conduct an investigation to reveal the different types of Technical Debt that can lead to the rejection of pull requests. From the analysis of 1,722 pull requests, we classify Technical Debt in seven categories namely design, documentation, test, build, project convention, performance, or security debt. Our results indicate that the most common category of Technical Debt is design with 39.34%, followed by test with 23.70% and project convention with 15.64%. We also note that the type of Technical Debt influences on the size of push request discussions, e.g., security and project convention debts instigate more discussion than the other types.
Conference Paper
Refactoring detection algorithms have been crucial to a variety of applications: (i) empirical studies about the evolution of code, tests, and faults, (ii) tools for library API migration, (iii) improving the comprehension of changes and code reviews, etc. However, recent research has questioned the accuracy of the state-of-the-art refactoring detection tools, which poses threats to the reliability of their application. Moreover, previous refactoring detection tools are very sensitive to user-provided similarity thresholds, which further reduces their practical accuracy. In addition, their requirement to build the project versions/revisions under analysis makes them inapplicable in many real-world scenarios. To reinvigorate a previously fruitful line of research that has stifled , we designed, implemented, and evaluated RMiner, a technique that overcomes the above limitations. At the heart of RMiner is an AST-based statement matching algorithm that determines refac-toring candidates without requiring user-defined thresholds. To empirically evaluate RMiner, we created the most comprehensive oracle to date that uses triangulation to create a dataset with considerably reduced bias, representing 3,188 refactorings from 185 open-source projects. Using this oracle, we found that RMiner has a precision of 98% and recall of 87%, which is a significant improvement over the previous state-of-the-art.