Developers’ Viewpoints to Avoid Bug-introducing
Changes
Jairo Souzaa, Rodrigo Limaa, Baldoino Fonsecaa, Bruno Cartaxob, Márcio Ribeiroa, Gustavo Pintoc, Rohit Gheyid, Alessandro Garciae
aFederal University of Alagoas (UFAL), Brazil
bFederal Institute of Pernambuco (IFPE), Brazil
cFederal University of Pará (UFPA), Brazil
dFederal University of Campina Grande (UFCG), Brazil
ePontifical Catholic University of Rio de Janeiro (PUC-Rio), Brazil
Corresponding author at: Institute of Computing, UFAL, Brazil. Email address: jrmcs@ic.ufal.br (Jairo Souza)
Abstract
Context: During software development, developers can make assumptions
that guide their development practices to avoid bug-introducing changes. For
instance, developers may consider that code with low test coverage is more likely
to introduce bugs; and thus, focus their attention on that code to avoid bugs,
neglecting other factors during the software development process. However,
there is no knowledge about the relevance of these assumptions for developers.
Objective: This study investigates the developers’ viewpoints on the rele-
vance of certain assumptions to avoid bug-introducing changes. In particular,
we analyze which assumptions developers can make during software develop-
ment; how relevant these assumptions are for developers; the common view-
points among developers regarding these assumptions; and the main reasons for
developers to put more/less relevance on some assumptions.
Method: We applied the Q-methodology, a mixed-method from the psy-
chometric spectrum, to investigate the relevance of assumptions and extract the
developers’ viewpoints systematically. We involved 41 developers analyzing 41
assumptions extracted from literature and personal interviews.
Results: We identified five viewpoints among developers regarding their
assumptions around bug-introducing changes. Despite the differences among
the viewpoints, there is also consensus, for example, regarding the importance of
being aware of changes invoking a high number of features. Moreover, developers
rely on personal and technical reasons to put relevance on some assumptions.
Conclusion: These findings are valuable knowledge for practitioners and
researchers towards future research directions and development practices im-
provements.
Keywords: Bug-introducing Changes, Developers’ Viewpoints,
Q-Methodology
1. Introduction
During software development, developers must carefully inspect committed
changes aiming to identify as many issues as possible, such as bugs, code im-
provements, alternative solutions, and more [1, 2]. However, even when devel-
opers perform a careful inspection, they can adopt assumptions that can lead
developers to avoid (or allow) changes able to introduce bugs, also known as
bug-introducing changes [3]. By assumptions, we mean developers’ perspectives
on what introduces bugs.
For example, some developers may agree that changes with high test cover-
age are less likely to introduce bugs [4]. Others may agree that changes involv-
ing a large number of files are more likely to introduce bugs [5]. By adopting
these assumptions, developers assign more or less relevance to certain fac-
tors during software development, leading developers to allow (or avoid) bug-
introducing changes. In this sense, understanding how much developers agree
(or disagree) with an assumption tends to reveal the most relevant factors to avoid
bug-introducing changes. This understanding is beneficial for various reasons.
First, it could help developers to focus their efforts on relevant factors during
software development. Second, it could shed light towards improving state-
of-the-art techniques/tools for accurate identification of changes more likely to
introduce bugs.
A key, prevalent challenge to understand the relevance of assumptions is to
deal with the variety of factors that can be related to bug-introducing changes.
To make matters worse, developers can diverge on the relevance of the same
assumption, thereby making it difficult to identify common viewpoints among
developers. By common viewpoints, we mean similar rankings of the relevance
of assumptions according to the developers’ perspectives. There is little knowl-
edge about the relevance of assumptions. More recently, Cartaxo et al. [6]
analyzed how much software engineering researchers agree (or disagree) with
assumptions related to the use of rapid reviews. From this analysis, the authors
could identify common viewpoints among researchers regarding the relevance
of these assumptions. Other studies focused on analyzing the developers’ be-
liefs and perceptions regarding different factors related to software engineering.
For instance, Meyer et al. [7, 8] investigated the developers’ perceptions about
productive work. In [9], the authors analyzed the relationship between work
practices, beliefs, and personality traits. Devanbu et al. [10] analyzed the devel-
opers’ beliefs in empirical software engineering. Matthies et al. [11] investigated
the developers’ perceptions about agile development practices and their usage in
Scrum teams. However, none of these studies investigated the developers’ view-
points about assumptions related to bug-introducing changes.
Our study aims at discovering common viewpoints among developers about
assumptions involving factors related to bug-introducing changes. In particular,
we investigate how relevant each assumption is for developers by analyzing how
strongly developers agree (or disagree) with some assumptions. Such analysis can
help us to extract common viewpoints among developers regarding the relevance
of the assumptions. Then, we also analyze the main reasons for developers to
agree/disagree with some assumptions. Finally, we examine which assumptions
are consensus/dissensus among the developers’ viewpoints.
To perform our study, we apply the Q-methodology [12] that enables us
to systematically investigate the relevance of assumptions and extract common
viewpoints about a topic under study. Also, Q-methodology studies neither
need large nor random samples. Actually, studies indicate that 40-60 partici-
pants are in an effective range [13], distinguishing it from survey studies that
generally require a larger sample of participants for statistical significance. In
our study, we involved 41 participants that analyzed 41 assumptions extracted
from different sources, for instance, academic papers and in-person interviews
with developers. Along the execution of the study, we also collected com-
ments provided by developers to understand the reasons they put more (or less)
relevance to some assumptions.
In summary, the main contributions of this work are: (i) we extract a wide
range of assumptions involving a variety of factors related to bug-introducing
changes. This might be useful for researchers who want to empirically analyze
if those assumptions actually hold in different contexts; (ii) we make available
a dataset containing the relevance that each developer puts in the assumptions.
This dataset also contains the developers’ comments describing the reasons to
agree or disagree with the assumptions. From this dataset, researchers can verify
our results as well as explore the data to answer other research questions;
(iii) the application of Q-methodology in the context of software engineering
can be useful to guide future work; and, finally, (iv) the viewpoints, reasons,
and consensus revealed in our study can be useful for practitioners to rethink
their development, review, and project management practices.
2. Q-Methodology
The Q-Methodology is a way to systematically investigate subjectivity, such
as opinions, assumptions, beliefs, behaviors, and attitudes. This methodology
has been used in diverse areas, such as economics, agriculture, and political
science [14, 15, 16, 12]. It combines the strengths of qualitative and quantitative
research (mixed study) through a methodological bridge between them [17]. The
qualitative nature of this methodology emerges from human subjectivity, while
the quantitative portion comes from the sophisticated statistical procedures for
data analysis (e.g., correlation and factor-analysis) [18]. Figure 1 presents the
overview of our research methodology and the steps necessary for applying the
Figure 1: Overview of the research methodology.
Q-Methodology [19], which will be described below.
1. Preliminary Assumptions
The first step of the Q-methodology is to gather the preliminary assumptions
referring to a particular topic. For instance, the assumption “Before accepting
pull-requests, it is essential to compile and run the system with the changes to
avoid the introduction of bugs” refers to the software development process. These
assumptions can emerge from different sources, such as academic papers and
in-person interviews.
2. Q-Set
After gathering the preliminary assumptions, we define the Q-Set following
the Q-Methodology steps. The Q-Set is the final list of assumptions after being
subjected to refinement to express a consistent and coherent set. The
refinement process can be conducted by removing duplicated or similar assump-
tions, fixing gaps, among other actions [20]. Although there is no specific number of
assumptions that should compose the Q-Set, Watts et al. [13] argue that studies with
a Q-Set between 40 and 80 statements are considered satisfactory.
3. P-Set
The P-set represents the participants of the research. They rank each as-
sumption according to their level of agreement. This ranking process is then
used as input to identify the existing viewpoints about the research topic ana-
lyzed. The sample of participants does not need to be large or statistically repre-
sentative of the population [13] since the results of a study using Q-Methodology
cannot be generalized to the entire population, but to specific viewpoints. In our
study, developers that are highly loaded into a viewpoint share the same per-
spective about bug-introducing changes. By developers loaded into a viewpoint,
we mean developers that hold this viewpoint.
4. Q-Sort Structure
The Q-Sort is a grid that represents the distribution of assumptions (Q-Set)
ranked by one participant according to her/his beliefs. This grid usually follows
a structure based on a quasi-normal distribution with the number of cells varying
according to the number of assumptions (Q-Set). A quasi-normal distribution is
often used, assuming that few assumptions generate strong engagement [12].
5. Q-Sorting process
The Q-Sorting process is the phase where the participants of the study (P-
Set) rank the assumptions regarding their level of agreement following the Q-
Sort structure, according to their perceptions.
6. Viewpoint Analysis
This step involves statistical procedures aiming to analyze the correlations
between the Q-sorts to reveal the smallest number of viewpoints, which can
explain the correlations [12]. Each Q-sort represents the participant’s beliefs
regarding the level of agreement with some assumptions. A viewpoint represents
common beliefs among participants about some assumptions [13]. Hence, two
participants that are loaded into the same viewpoint will have very similar
beliefs about certain assumptions. To perform the viewpoint analysis, the Q-
methodology defines the following steps:
6.1 - Initial matrix. We structure the Q-sorts in a two-dimensional matrix
(participants × assumptions). The value of each cell in this matrix is the level of
agreement (or disagreement) that the participant attributed to the assumption
in the Q-Sort.
6.2 - Correlation matrix. After creating the matrix of participants and as-
sumptions, statistical techniques (such as Pearson [21] and Spearman [22]) are
applied to produce a matrix of correlations between participants (i.e., their
q-sorts).
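To make steps 6.1 and 6.2 concrete, the sketch below (in Python with NumPy; the synthetic data and variable names are our illustration, not part of the methodology itself) builds the initial matrix and derives the correlation matrix between Q-sorts:

import numpy as np

rng = np.random.default_rng(0)
n_participants, n_assumptions = 41, 41

# Step 6.1 - Initial matrix: one row per participant (Q-sort), one column per
# assumption; each cell is the agreement level the participant gave (-3..+3).
# (Random synthetic data; a real study would load the collected Q-sorts.)
q_sorts = rng.integers(-3, 4, size=(n_participants, n_assumptions))

# Step 6.2 - Correlation matrix: Pearson correlations between participants,
# i.e., between the rows of the initial matrix (np.corrcoef correlates rows).
correlations = np.corrcoef(q_sorts)    # shape (41, 41), values in [-1, +1]
print(correlations[0, 1])              # correlation between Q-sorts #1 and #2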
6.3 - Initial viewpoints. From the correlation matrix, we extracted the ini-
tial viewpoints. A viewpoint is the weighted average Q-Sort of a group of
participants that assigned similar values to the assumptions, i.e., it represents a
hypothetical participant that best represents how those with similar viewpoints
would sort the assumptions [19]. The intuition behind this step is that Q-
sorts which are highly correlated with one another may be considered to share
a similar viewpoint.
To produce the initial viewpoints, reduction techniques, such as Centroid
Method (CM) and Principal Component Analysis (PCA) [23], can be used to
produce a grouping of similar viewpoints based on the correlations. After pro-
ducing these viewpoints, a variety of measures can be used to evaluate the
variance associated with these viewpoints, such as: (Eigenvalue) it indicates the
level of variance existing in each viewpoint [24]; (Explained Variance) it rep-
resents the percentage of total variance calculated for each viewpoint divided
by the total number of participants. This represents the distribution of the variance
existing in each viewpoint; and (Cumulative Explained Variance) it represents
the sum of the variance among the viewpoints. These measures will be essential
to select the final viewpoints.
The correlation of each Q-Sort with each viewpoint is given by the viewpoint
loadings, which range from -1 to +1. The higher the absolute value of loading
(i.e., the correlation), the more the Q-sort (which represents the developer’s
beliefs) is associated to the viewpoint [19].155
6.4 - Viewpoints rotation. Although several viewpoints can be extracted
from the Q-Sorts correlation matrix, few viewpoints are capable of explaining
the most variance of the matrix. The rotation of viewpoints aims at identifying
these viewpoints. As a result, we obtain an increase in the strength of the cor-
relations between some participants (Q-Sorts) and viewpoints. Before rotating
the viewpoints, we need to define which ones should be rotated. To do that,
the Q-Methodology suggests some criteria to determine how many viewpoints
to retain. The most common method of choosing the number of viewpoints
to retain is named the Guttman Rule [25], which requires retaining viewpoints whose
eigenvalues are higher than 1. The rationale for this threshold is that a
retained viewpoint with an eigenvalue lower than 1 explains less variance than a
single Q-Sort (participant) [24]. Another criterion is to require that the cumulative
explained variance of each viewpoint should be above 40%, which is the pro-
portion of the assumptions that are explained by the viewpoints [13, 19]. In
case the criteria are not enough to select a reduced number of viewpoints to
rotate, the Q-Methodology supports the use of qualitative criteria. This leaves
the researcher free to consider any solution they consider theoretically informa-
tive [26, 13]. To proceed with the rotation of the viewpoints, mathematically
optimal (such as varimax) or manual (judgemental) techniques [26] can be
used. The choice will depend on the nature of the data and upon the aims of
the investigator to reveal the most theoretically informative solution [12].
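The retention criteria and rotation can be sketched as follows; the varimax routine is a standard textbook implementation, not the code of any specific Q-methodology tool:

# Step 6.4 - Retention criteria: the Guttman rule (eigenvalue > 1) and a
# cumulative explained variance above 40%.
guttman_ok = eigenvalues > 1.0
variance_ok = cumulative > 40.0

def varimax(loadings, max_iter=100, tol=1e-6):
    """Orthogonal varimax rotation of a loadings matrix (textbook recipe)."""
    n, k = loadings.shape
    rotation = np.eye(k)
    criterion = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        # Gradient of the varimax simplicity criterion.
        grad = rotated ** 3 - rotated @ np.diag((rotated ** 2).sum(axis=0)) / n
        u, s, vt = np.linalg.svd(loadings.T @ grad)
        rotation = u @ vt
        if s.sum() < criterion * (1 + tol):
            break                                     # converged
        criterion = s.sum()
    return loadings @ rotation

rotated = varimax(loadings[:, :5])    # rotate the five retained viewpoints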
6.5 - Flagging. After rotating the viewpoints, the participants with signifi-
cant loading are flagged, i.e., the participants (Q-Sorts) most representative of
each viewpoint. In practice, when flagging, one aims at maximizing differences
between viewpoints [19]. To do that, the Q-methodology suggests to apply au-180
tomatic flagging based on criteria that consider the loading significance and the
statistical significance of the correlations between the participants (Q-sort) and
the viewpoints.
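Assuming the thresholds commonly used in Q-methodology (the 1.96-times-standard-error bound is also reported later, in Section 3.6), the automatic flagging can be sketched as:

# Step 6.5 - Automatic flagging. A loading is significant at p < .05 when its
# absolute value exceeds 1.96 times the standard error of a zero-order
# loading, i.e., 1.96 / sqrt(size of the Q-Set).
threshold = 1.96 / np.sqrt(n_assumptions)      # about 0.31 for a 41-item Q-Set

squared = rotated ** 2
# Flag a Q-sort for a viewpoint when its loading is significant AND its
# squared loading exceeds the sum of its squared loadings elsewhere.
flags = (np.abs(rotated) > threshold) & \
        (squared > squared.sum(axis=1, keepdims=True) - squared)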
6.6 - Z-scores and viewpoint scores. In the last step of viewpoint analysis,
the focus is to investigate the relation between the assumptions and viewpoints.
Initially, the Q-methodology defines the analysis of the ranking of assumptions
within each viewpoint (also known as viewpoint score) through the use of z-
scores. A z-score is the number of standard deviations by which a data point is
above or below the mean of a distribution [27]. In particular, the viewpoint score
indicates the assumption’s relative position within the viewpoint. To obtain the
viewpoint score for each assumption and viewpoint, the z-score is the weighted
average of the ranks attributed to the assumption by the flagged developers (Q-
sorts) loaded into the viewpoint [28]. In practice, the viewpoint scores enable
us to perceive how strongly developers loaded in a viewpoint agree (or disagree)
with a specific assumption.
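A sketch of step 6.6 under Brown's classic w/(1 - w^2) factor weighting (an assumption on our part; a given tool may weight the flagged Q-sorts differently):

# Step 6.6 - Z-scores of one viewpoint: the weighted average of the
# standardized ranks of the flagged Q-sorts, re-standardized at the end.
def viewpoint_zscores(q_sorts, loadings, flags, viewpoint):
    devs = np.flatnonzero(flags[:, viewpoint])       # flagged developers
    w = loadings[devs, viewpoint]
    weights = w / (1.0 - w ** 2)                     # assumed factor weights
    ranks = q_sorts[devs].astype(float)
    z = (ranks - ranks.mean(axis=1, keepdims=True)) / ranks.std(axis=1, keepdims=True)
    avg = (weights[:, None] * z).sum(axis=0) / weights.sum()
    return (avg - avg.mean()) / avg.std()            # one z-score per assumption

z_scores = viewpoint_zscores(q_sorts, rotated, flags, viewpoint=0)
# The viewpoint scores then rank the assumptions by z-score and refit the
# forced -3..+3 grid, yielding the idealized Q-sort of the viewpoint.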
7. Viewpoint Interpretation.
This last step consists of interpreting the viewpoints resulting from the pre-
vious analysis aiming to provide a rationale that explains each identified view-
point. It involves the production of a series of summarizing reports, each of
which explains the viewpoint, assumptions, and their ranks [13]. By interpret-
ing the viewpoints, it is also possible to find interesting information regarding
consensuses and dissensions among viewpoints.
Why Q-Methodology?
We believe the Q-Methodology is more adequate in the context of this
study because it reveals much more than a simple list of assumptions. A
Q-methodology study provides, as a result, a typology of viewpoints about a
specific topic. With such typology in mind, we can frame any developer ac-
cording to one of the identified viewpoints and thus understand in detail why
that developer behaves and acts in a certain way when they are trying to avoid
bug-introducing changes during software development. That would be useful to
better address her/his beliefs that are not supported by scientific evidence.
3. Research Method
This study aims to provide qualitative evidence about the developers’ view-
points regarding assumptions involving factors related to bug-introducing changes.
To do that, we intend to answer three research questions:
RQ1. What are the developers’ viewpoints regarding bug-introducing changes?
This research question investigates the common viewpoints among the de-
velopers regarding the assumptions involving factors related to bug-introducing
changes. To do that, we apply the Q-methodology to systematically analyze
how strongly developers agree or disagree with each assumption and, then, to
recognize the developers’ viewpoints.
RQ2. What are the developers’ reasons to agree with an assumption?
Besides recognizing the developers’ viewpoints, we also investigate the main
reasons for developers to agree (or disagree) with an assumption. To do that,
we ask developers to comment on why they agree or disagree with some assump-
tions.
RQ3. Which assumptions are consensus/dissensus among developers’ view-
points?
Finally, we analyze the assumptions that are consensus/dissensus between
the developers’ viewpoints. By dissensus, we mean the lack of consensus.
3.1. Preliminary Assumptions
To collect the preliminary assumptions, the first three authors applied an in-
ductive (unstructured) approach [13]. First, we analyzed 167 academic research
papers. We collected those papers from popular scientific databases (such as
Google Scholar and DBLP) by searching for keywords, such as bugs, software
changes, faults, defects, and bug-introducing changes. In particular, we com-
bined those keywords to elaborate queries, such as ”bugs and changes”, and
”defects and changes”. We used general keywords aiming to collect papers in-
volving a variety of assumptions. We stopped when the papers retrieved from
these scientific databases no longer presented any relation to factors related to the
introduction of bugs. Even though we have performed a broad search of papers,
there still might exist papers not considered in our study. Finally, we extracted
61 assumptions by analyzing the results and discussions of those papers. In par-
ticular, we tried to identify if the results or discussions indicated factors that
may lead developers to allow (or avoid) bug-introducing changes. If so,
we elaborated an assumption involving the factor and its influence on the
introduction of bugs.
For example, in [29], the authors suggest that the existence of code smells
is an indicator of fault-proneness. We elaborated the following assumption:
“Changes containing code smells are more likely to introduce bugs”. Besides
analyzing academic papers, we also talked with 10 developers to discuss
factors that may avoid bug-introducing changes. Based on the discussions, we
elaborated 29 assumptions. As a result of the analysis of all these sources, we
obtained 90 preliminary assumptions involving different factors that may influ-
ence the introduction of bugs, as described in the accompanying web page [30].
3.2. Q-Set
To define the Q-set, we considered the 90 preliminary assumptions and val-
idated them by following a method adopted in a previous study [20]. First, we
analyzed the redundant assumptions and selected those expressing the
information most clearly and objectively. For example, consider the
assumptions “Understanding the software history is mandatory for developers to
avoid the introduction of bugs” and “The analysis of history about the changes
performed can be timely and valid to avoid the introduction of bugs”. Note
that both are related to the use of software history to avoid bug-introducing
changes. Thus, we selected the assumption “Understanding the software his-
tory is mandatory for developers to avoid the introduction of bugs” to analyze in
our study. Next, we analyzed the ambiguity among assumptions. Three experi-
enced professionals, with more than 10 years of research and practice experience
in software development, were responsible for that analysis. We asked them to
answer the following questions:
Is the assumption not ambiguous from the point of view of a
developer? We removed seven assumptions that presented ambiguity ac-
cording to the professionals;
Do you have any suggestions to improve the assumption? The
professionals could suggest writing improvements to the assumption under
analysis. They suggested some grammatical and semantic improvements;
Are there other assumptions expressed in the sources (litera-
ture, forums, discussions...) you would suggest that are not rep-
resented in the preliminary assumptions? The professionals could
suggest assumptions that have not been considered. They suggested eight
assumptions from existing studies [11, 10].
Finally, the authors had an online meeting to discuss the final as-
sumptions and apply minor improvements. Thereby, we ended up with a Q-Set
with 41 assumptions, where 30 (73%) are from the literature, and 11 (27%) from
in-person interviews. Most of the assumptions analyzed in our study are based
on the literature. On the other hand, if we had considered only assumptions
empirically validated by previous studies, we could have ignored important assump-
tions that have been considered by developers but not empirically validated yet.
For example, the assumption “Before accepting pull-requests, it is mandatory
to compile and run the system with the changes to avoid the introduction of
bugs” (A17) is based on the discussions during the in-person interviews.
3.3. P-Set
Initially, we involved 10 developers from the local industry. Then, we cre-
ated a list of developers with whom the authors had already had contact in
the industry or academia. We ended up with a list containing 72 developers
from different continents (South and North America as well as Europe) and
backgrounds (industry and academia). We sent them individual emails and we
obtained 31 answers from developers in South and North America. Thereby, we
ended up with 41 developers.
After completing the Q-Methodology process, the participants answered a
characterization questionnaire containing questions regarding their educational
level. Also, they reported their experience in bug fixing and reporting, code
review, and software development. The answers indicate that 53% of the partici-
pants are from industry, and 47% are from academia. More than 70% of the
participants have high or very high experience in software development, and
51% have high or very high experience in bug fixing. We deliberately involved
participants who have experience with important activities related to our study,
and participants that work part-time in academia and industry.
3.4. Q-Sort Structure
Our Q-Sort structure is composed of 41 cells, and the assumptions can be
ranked in one of the seven columns between -3 and +3, as depicted in Figure 1.
The number in parentheses at the top of each column indicates the number of
assumptions that should be ranked in that column. For instance, participants have
to rank two assumptions as +3, the ones they agree with the most.
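Only the size of the +3 column (two assumptions) is stated here; assuming an illustrative symmetric quasi-normal grid that sums to 41 cards, a participant's ranking can be checked against the forced distribution as follows:

from collections import Counter

# Forced distribution over the seven columns; only the +3 column (2 cards)
# is stated in the text -- the remaining counts are illustrative assumptions.
GRID = {-3: 2, -2: 5, -1: 8, 0: 11, 1: 8, 2: 5, 3: 2}
assert sum(GRID.values()) == 41

def is_valid_qsort(ranks):
    """True if the 41 ranks match the forced quasi-normal distribution."""
    return Counter(ranks) == Counter(GRID)

example = [rank for rank, count in GRID.items() for _ in range(count)]
print(is_valid_qsort(example))    # True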
3.5. Q-Sorting process
With the Q-sort structure defined, the participants (P-Set) can perform
the Q-Sorting process. They rank the assumptions according to their level of
agreement or disagreement following the Q-sort structure. The participants
perform all the demanded tasks, of which Q-Sorting is one, through an
online web tool.1
First, we conducted a pilot study involving ten participants. During the
pilot, we encouraged them to provide feedback regarding their understanding
and time to complete the tasks, as well as the usability of the web tool, including
any notable bugs. They took about 20 minutes to conclude the tasks. We did
not observe, nor did any participant report, relevant issues while they
executed the experiment, so we decided to consider their answers in the results
of this research. After performing the pilot study, we sent individual emails to
72 developers, including a URL containing an invitation letter. We obtained
answers from 31 developers, giving a total of 41 developers. The tasks the
1 https://bic.netlify.com
participants have to perform are divided into a 3-phase flow containing five
steps, detailed as follows.
Welcome Page: In phase 1, we present a welcome page to the participant
containing a brief description of this study. At this point, we described the
motivation of our study, and we explained some concepts, e.g., bug-introducing
changes. We also provided instructions about the estimated time, a support contact,
and a list of browsers supported by our tool;
Q-sorting: Next, in phase 2, the participant did the ranking (Q-sorting)
of the assumptions according to their agreement or disagreement. Initially, we
asked the participant to drag and drop 41 cards, each containing one assumption
of the Q-Set, into three piles — Agree, Disagree and Neutral — as depicted in
Figure 2.a. The cards are presented in random order to avoid bias.
After defining the piles, we asked the participants to rate the assumptions
according to the Q-Sort Structure, as shown in Figure 2.b. Each assumption
must be associated with only one level of agreement. However, the participants
could rearrange the assumptions as often as they wished. Once the participant
is satisfied with the distribution, that distribution (Q-Sort) is associated with
him/her.
Once the participants defined their Q-sort, we asked them to explain why
they ranked the assumptions on the extremes (i.e., the ones ranked as -3 and the
ones as +3). The goal is to have qualitative evidence for further analysis aiming
to understand why participants ranked certain assumptions on the extremes.
Final Page: After completing the assumption ranking, the participants
answered a characterization questionnaire in phase 3. It contains questions re-
garding their educational level, previous job, as well as their experience in issues
related to bugs, i.e., bug fixing and reporting, code review, software develop-
ment, and testing. A participant’s response is only counted in the final P-Set if
s/he passed through all the steps.
(a) Classifying assumptions in three levels of agreement. The white card on the top represents
the assumption that the participant should drag and drop to one of the colored boxes.
(b) Ranking the assumptions in seven agreement levels according to the Q-Sort structure.
Figure 2: Main steps of the Q-sorting process.
3.6. Conducting the Viewpoint Analysis
From the Q-Sorts of the 41 participants, we conducted the viewpoint anal-
ysis using the Ken-Q Analysis Web Application2, which performs the process
of viewpoint analysis of Q-Methodology. Figure 1 presents the steps of the
viewpoint analysis and decisions related to each step. We describe the artifacts
produced in each step in the accompanying web page [30].
Initial Matrix. We organized the Q-sorts in a two-dimensional matrix (par-
ticipants × assumptions) containing the values that the participants attributed
to the assumptions in the Q-Sort (i.e., ranging from -3 to +3), as described in the
section Data Matrix (Q-sorts) in the accompanying web page [30].
Correlation matrix. Once we created the matrix of participants and as-
sumptions, we used Pearson’s correlation, as provided by the Ken-Q Analysis
Web Tool, to calculate the correlation coefficients among the participants (i.e.,
their q-sorts).
As a result, we obtained a full correlation matrix (see section Correlations
between Q-sorts in the accompanying web page [30]), where a perfect positive
correlation is registered as 1 and a perfect negative correlation is -1. This
correlation matrix was used as input to extract the initial viewpoints in the
next step.
Initial viewpoints. We extracted the initial viewpoints from the correla-
tion matrix from the previous step. Table 1 contains the initial set of viewpoint
loadings for each of the 41 Q-sorts (developers). For instance, the
Q-sort related to Developer #1 presents viewpoint loadings varying between
-0.03 and 0.47. These viewpoint loadings indicate that the Q-sort related to
Developer #1 is most associated with Viewpoint E (0.47). Table 1 also de-
scribes the Eigenvalue, Explained Variance, and Cumulative Explained Variance
of each viewpoint.
Viewpoints rotation. Note that all the viewpoints have an eigenvalue
2 https://shawnbanasick.github.io/ken-q-analysis/
Table 1: Initial Set of Viewpoints
Developer’s ID (Q-Sort) Viewpoints
A B C D E F G H
#1 0.46 0.09 -0.03 0.30 0.47 0.09 -0.10 -0.17
#2 0.54 -0.04 0.09 -0.15 0.04 0.01 0.34 -0.34
#3 0.57 -0.22 -0.07 -0.11 -0.09 0.30 -0.22 -0.31
#4 0.56 -0.12 -0.04 0.18 0.18 -0.01 -0.26 -0.03
#5 0.60 -0.20 -0.26 -0.10 0.34 0.20 -0.20 0.03
#6 0.56 -0.10 0.31 0.42 0.16 -0.10 0.31 -0.09
#7 0.42 0.45 0.10 0.01 -0.27 -0.08 0.32 -0.09
#8 0.33 -0.15 -0.32 -0.16 0.37 0.03 0.14 0.29
#9 0.33 0.16 -0.42 0.16 0.10 0.57 0.15 0.14
#10 0.55 0.03 -0.12 -0.08 -0.33 0.07 -0.08 0.51
#11 0.32 0.28 0.10 0.26 0.46 -0.37 0.05 0.08
#12 0.49 0.03 0.11 -0.41 0.18 0.16 0.29 -0.36
#13 0.62 -0.09 -0.16 -0.08 -0.37 0.13 0.20 -0.23
#14 0.50 0.35 0.07 0.17 -0.16 -0.10 -0.41 -0.08
#15 0.50 0.15 -0.05 -0.13 0.36 -0.27 0.03 -0.10
#16 0.55 0.13 0.23 0.20 0.20 -0.24 -0.29 0.03
#17 0.57 -0.04 0.29 0.14 -0.01 -0.20 0.34 0.28
#18 -0.15 0.33 0.11 0.63 -0.15 0.11 -0.07 -0.26
#19 0.47 -0.17 0.27 -0.42 0.00 -0.12 0.10 0.29
#20 0.34 0.36 0.55 -0.25 -0.25 0.35 0.08 0.20
#21 0.42 0.01 0.20 0.45 -0.29 0.34 -0.04 0.09
#22 0.43 0.23 -0.11 -0.53 -0.05 -0.20 -0.16 -0.09
#23 0.18 0.04 0.66 -0.05 0.41 0.10 -0.06 0.16
#24 0.31 -0.35 0.38 -0.20 -0.03 0.35 -0.15 0.13
#25 0.05 0.06 -0.49 0.48 0.01 0.12 0.39 0.18
#26 0.58 -0.10 0.15 -0.16 -0.12 0.30 0.25 -0.16
#27 0.65 -0.25 0.04 0.26 -0.21 -0.03 0.10 0.23
#28 0.45 0.38 -0.19 -0.10 -0.07 0.29 -0.20 -0.16
#29 0.44 0.02 -0.32 0.06 -0.40 -0.33 -0.27 0.07
#30 0.56 0.22 -0.32 -0.16 -0.21 0.01 0.08 0.23
#31 0.29 -0.70 0.26 0.27 -0.28 0.02 0.01 -0.11
#32 0.53 0.26 0.03 -0.05 -0.09 0.06 -0.40 0.09
#33 0.65 0.26 -0.05 0.07 0.05 0.05 -0.26 -0.21
#34 0.31 -0.73 -0.08 0.21 -0.09 -0.02 -0.25 -0.04
#35 0.37 0.26 0.03 -0.03 0.01 -0.21 -0.09 0.20
#36 0.25 -0.13 -0.32 0.00 0.55 0.30 0.06 0.12
#37 0.59 -0.31 -0.29 -0.20 0.04 -0.33 0.02 -0.12
#38 0.34 -0.10 0.53 0.09 0.11 -0.14 -0.01 -0.03
#39 0.40 0.45 -0.07 0.08 -0.17 -0.25 0.32 -0.11
#40 0.65 0.02 -0.19 0.23 0.15 0.02 0.06 0.07
#41 -0.48 0.44 0.19 0.03 0.21 0.42 -0.11 0.09
Eigenvalue 9.03 3.07 2.81 2.51 2.40 2.09 1.87 1.52
% Explained Variance 22 7 7 6 6 5 5 4
% Cumulative Explained Variance 22 29 36 42 48 53 58 62
above 1; Viewpoint H has the lowest eigenvalue, 1.52. Thus, the
Guttman Rule does not help us to determine how many viewpoints to retain.
On the other hand, only when we consider at least four viewpoints do we obtain a
cumulative explained variance higher than 40%. Note that we obtain a cumu-
lative explained variance of 42% when we consider the viewpoints A, B, C, and
D. Then, the application of these constraints leads us to retain four or more
viewpoints. Although the constraints have reduced our scope of viewpoints, we
still have to choose how many viewpoints, between four and eight, to rotate. At this point,
we use qualitative criteria.
We need to retain viewpoints for which we can interpret the reasoning behind
the developers’ beliefs loaded in each viewpoint. After analyzing the solutions
of viewpoints, we observed that the number of distinguishing assumptions that
compose each viewpoint is decisive to interpret a viewpoint. A distinguishing
assumption is an assumption ranked in a viewpoint that significantly differs
from its rank in all other viewpoints. However, in viewpoints composed of a
high or low number of distinguishing assumptions, it is more difficult to inter-
pret the reasoning behind the developers’ beliefs loaded in the viewpoint. We
also observed that the solution containing only four viewpoints is composed of
many distinguishing assumptions. On the other hand, the solutions above six
viewpoints are composed of few distinguishing assumptions. Hence, we decided
to consider a solution with five viewpoints. Although we have selected a solution
with five viewpoints to perform the remaining steps of the Q-Methodology, all
the solutions of viewpoints are available in the accompanying web page [30].
Once we selected five viewpoints to be rotated, we performed the rotation
by using the varimax rotation, which statistically positions the viewpoints to
cover the maximum amount of variance, and ensures that each Q-Sort has a
high viewpoint loading on only one viewpoint. We preferred a mathematical
rotation since it makes theoretical sense for us to pursue a rotated solution that
maximizes the amount of variance explained by the extracted viewpoints [13].
Flagging. After rotating the viewpoints, we flagged the participants with
significant loading, i.e., the participants (Q-Sorts) most representative of each
Table 2: Rotated Viewpoints and Flagged participants
Developer’s ID Viewpoint A Viewpoint B Viewpoint C Viewpoint D Viewpoint E
# Defining participants 11 5 6 3 4
#1 0.1317 0.0107 0.5681 -0.1233 0.4178
#3 0.2774 0.4562 0.0694 0.1855 0.2656
#5 0.1795 0.2834 0.1791 0.0777 0.6583
#6 0.1217 0.3369 0.705 0.0325 0.0615
#7 0.6357 -0.0141 0.1688 0.0635 -0.1584
#8 0.0502 0.0984 0.0193 0.0271 0.6174
#10 0.5261 0.3814 -0.0179 0.0843 0.0697
#11 0.1502 -0.2067 0.5685 -0.0767 0.2645
#14 0.5653 0.0802 0.3127 -0.0331 -0.0587
#15 0.2957 -0.0486 0.2827 0.1655 0.4694
#16 0.2895 0.1099 0.567 0.1059 0.1503
#19 0.1914 0.2443 0.0738 0.6049 0.1826
#22 0.5198 -0.0323 -0.1893 0.3721 0.2967
#23 -0.1261 -0.1822 0.5784 0.5051 0.0037
#24 -0.0618 0.347 0.1602 0.5016 0.0093
#25 0.0906 0.1043 0.0555 -0.6533 0.1471
#27 0.2909 0.6312 0.3264 0.0012 0.056
#28 0.6001 -0.0356 0.0435 -0.0086 0.1906
#29 0.4943 0.4036 -0.1238 -0.1718 0.0464
#30 0.6518 0.2002 -0.0959 -0.0099 0.2542
#31 -0.2335 0.8025 0.2123 0.119 -0.1667
#32 0.5351 0.0945 0.2009 0.1321 0.0977
#33 0.5544 0.1235 0.3337 0.0176 0.2434
#34 -0.2451 0.7627 0.0833 -0.0331 0.1654
#35 0.3941 -0.0215 0.1871 0.0861 0.0942
#36 -0.0821 -0.0021 0.1612 -0.1021 0.662
#38 -0.0017 0.1673 0.5089 0.3641 -0.0824
#39 0.6106 -0.0327 0.1546 -0.0946 -0.0242
#41 -0.1431 -0.6514 0.0554 -0.0567 -0.23
viewpoint. To do that, we performed the automatic flagging, which is based on
three criteria: (i) the loading value should be higher than 1.96 times
the standard error of a zero-order loading (in our case, for a p-value < .05);
(ii) the squared loading for a viewpoint should be higher than the sum of the
squared loadings for all other viewpoints [12]; and, (iii) the correlation between
the participant (Q-sort) and the viewpoint is statistically significant with a
p-value < 0.05. As a result of the rotation and flagging, we obtained Table 2,
which describes the loading values between the participants and the five rotated
viewpoints. We present the significant loading values in blue . Note that 29 of430
the 41 participants present a significant loading (70% of the participants), and
no participant was loaded significantly in two or more viewpoints. Finally, we
also describe the number of participants loaded in each viewpoint.
Z-scores and viewpoint scores. At this point, our focus is to investigate
the relation between the assumptions and viewpoints. Initially, we analyzed the
ranking of assumptions within each viewpoint (viewpoint score) through the
use of z-scores. As a result, we obtained Table 3, which describes the viewpoint
scores for each assumption and viewpoint. These viewpoint scores indicate
how strongly developers loaded in a viewpoint agree (or disagree) with a specific
assumption. For example, developers loaded in Viewpoint E strongly agree
(+3) that “Understanding the rationale behind changes is mandatory to avoid
the bugs introduction” (Assumption #1).
Table 3: Assumptions and viewpoint scores. The highlighted cells in green and
red indicate the extreme scores of each viewpoint, whereas “**” and “*” are
used to denote the distinguishing assumptions of each viewpoint, indicating signifi-
cance at P<.01 and P<.05, respectively. Consensus assumptions are in bold, while
dissensus assumptions are in italics.
# Assumptions Viewpoints Scores
A B C D E
1 Understanding the rationale behind changes is mandatory to avoid
the bugs introduction. [31]
+1* 0 0 +2 +3
2 Changes that neglect non-functional requirements are more likely
to introduce bugs. [32]
0 +2* -2 0* -1
3 Having documentation associated with source code is decisive for
developers to avoid bug-introducing changes on it. [5]
0 +1 0 0 +2**
4 Familiarity of developers with source code is mandatory to avoid
bug-introducing changes. [33]
0 0 +2** -1 +3
5 Changes involving a large number of files are more likely to intro-
duce bugs. [5, 34]
0* +2* 0 +2 +1
6 When reviewing a change, reviewers need to be aware of all
artifacts impacted by this change to avoid the introduction of
bugs. [35]
+1 +1 -1 +3** 0
7Changes invoking a high number of features (methods,
fields, classes) are more likely to introduce bugs. [36]
+1 +3 +1 +2 +1
8 Changes with high test coverage introduce less bugs. [4] +2** -1 +1 +1 0
9 Having a testing team is mandatory to avoid bug-introducing
changes. [37]
-2* +2 +3** 0 0*
10 Automated testing reduces considerably the introduction of bugs. +3** 0 +1 +1 0
11 Manual tests verify which changes meet user requirements. There-
fore, manual testing is mandatory to avoid bugs. [38]
-1 +1** -1 -2 -2
12 Continuous integration is mandatory to avoid bug-introducing
changes.
+1 0 0 -1 -2
13 Using software quality assurance (SQA) methods in the code re-
view process is decisive to avoid bug-introducing changes.
+2 +2 0 -1 +1
14 Changes that break the software architectural design are more
likely to introduce bugs. [39]
+1 0 +1 -1* 0
15 Changes containing code smells are more likely to introduce
bugs. [29]
+2** -1 -1 -2 -1
16 Avoiding recurring changes reduces the introduction of bugs. [40] 0 0 -2 +1 -2
17 Before accepting pull-requests, it is mandatory to compile
and run the system with the changes to avoid the intro-
duction of bugs.
0 +3 +3 +3 +1
18 Familiarity of developers with the programming language
adopted in a project is mandatory to avoid bugs.
+1 -1 -2** -1 +2
19 Code reviews reduce considerably the introduction of bugs. [34] +2 0** +2 -2** +2
20 Developers working on legacy code introduce more bugs. 0 +1 +1 -1 0
21 Adopting pair programming is decisive to avoid bug-introducing
changes.
-1 -1 -1 0 +1
22 Agile methods aim at reducing the delivery lifecycle. Therefore,
when adopting these methods, developers are more likely to per-
form bug-introducing changes. [41]
-2 -1 +1** -1 -1
23 Code snippets frequently changed are hotspots of bugs. [42] 0 -1** 0 +1 +2
24 Understanding the software history is mandatory for developers
to avoid the introduction of bugs. [43]
-3** +2** -1 0 -1
25 Code reuse avoids the introduction of bugs. [44] +1 -2 +2 +1 -3
26 Handling the exceptions associated with a change is mandatory
to avoid the introduction of bugs. [45]
0 0** -1 -2** 1
27 Using static analysis tools is decisive to avoid bug-introducing
changes. [46]
-1 0 -1 -1 -1
28 Experienced developers introduce less bugs. [47] +3** -2** +1 +1 +1
29 Fixing bugs is riskier (more likely to introduce bugs) than adding
new features. [48]
-2 +1 -3** 0 -3
30 Floss refactorings tend to introduce bugs. This refactoring con-
sists of refactoring the code together with non-structural changes
as a means to reach other goals, such as adding features or remov-
ing bugs.[49]
0 0 -2 -2 0
31 Root-canal refactorings are less prone to introduce bugs. This
refactoring is used for strictly improving the source code structure
and consists of pure refactoring.[49]
-1 -2 0 +1* -1
32 Adaptive changes are more likely to introduce bugs. These
changes aim at migrating systems to a new version of a third
party API. [50]
0 -1 0 0 0
33 Developers working on their own code are less likely to introduce
bugs [51, 47]
-1* -2** 0 +1 +2*
34 Merge commits introduce more bugs than other commits. [10] -2 +1 0** +2 -2
35 Conflicting changes introduce more bugs than non-conflicting
ones.
+2 0 +2 0 0
36 Developers working on code snippets containing third-party APIs
tend to introduce bugs. [50]
-1 -1 -2 0** -1
37 Introduction of bugs depends on which programming lan-
guage is used. [52]
-3 -3 -3 -3 -1**
38 Geographically distributed teams introduce more bugs than teams
that are not geographically distributed. [10]
-2 -2 -1 +2** -2
39 Developer’s specific experience in the project is more decisive to
avoid the introduction of bugs than overall general experience in
programming. [10]
-1 +1 +2 0 +1
40 Code written in a language with static typing (e.g., C#) intro-
duces fewer bugs than code written in a language with dynamic
typing (e.g., Python). [53]
-1 -3 +1* -3 0
41 Code snippet changed by different developers is hotspot of
bugs. [54]
+1 +1 0 0 0
4. Results and Discussions
In this section, we describe the main results to answer the research questions
analyzed in our study.
4.1. RQ1. What are the developers’ viewpoints regarding bug-introducing changes?
In Section 3, we described how we applied the Q-methodology to analyze
the developers’ beliefs aiming at identifying common viewpoints. As a result,
we identified five viewpoints (named A, B, C, D, and E). Now, RQ1 intends
to interpret and discuss those viewpoints by following the procedure described
in [13]. From this interpretation, we expect to understand the beliefs of groups of
developers on the relevance (in terms of the level of agreement or disagreement)
of certain assumptions. In particular, we focus on interpreting the beliefs of the
groups of developers that compose the viewpoints A-E. Initially, three authors
worked separately to interpret the viewpoints. Then, they worked together to
produce the final interpretation of the viewpoints A, B, C, D, and E. For each
viewpoint, we described the background and experience of the developers loaded into
the viewpoint. We also presented the assumption scores of each viewpoint.
For instance, (A10: +3), which means that for the viewpoint under analysis,
assumption number 10 is ranked as +3. Table 4 summarizes the assumptions
statistically related to the five viewpoints (A, B, C, D, and E), and the ranks
attributed by the developers loaded into each viewpoint.
Table 4: Assumptions and Ranks in Viewpoints A, B, C, D and E
Viewpoint Assumptions Rank
A
(A10) Automated testing reduces considerably the introduction of bugs. +3
(A28) Experienced developers introduce less bugs. +3
(A8) Changes with high test coverage introduce less bugs. +2
(A15) Changes containing code smells are more likely to introduce bugs. +2
A
(A1) Understanding the rationale behind changes is mandatory to avoid the bugs
introduction.
+1
(A5) Changes involving a large number of files are more likely to introduce bugs. 0
(A33) Developers working on their own code are less likely to introduce bugs. -1
(A9) Having a testing team is mandatory to avoid bug-introducing changes. -2
(A24) Understanding the software history is mandatory for developers to avoid
the introduction of bugs.
-3
B
(A24) Understanding the software history is mandatory for developers to avoid
the introduction of bugs.
+2
(A5) Changes involving a large number of files are more likely to introduce bugs. +2
(A2) Changes that neglect non-functional requirements are more likely to introduce
bugs.
+2
(A11) Manual tests verify which changes meet user requirements. Therefore, man-
ual testing is mandatory to avoid bugs.
+1
(A27) Using static analysis tools is decisive to avoid bug-introducing changes. 0
(A19) Code reviews reduce considerably the introduction of bugs. 0
(A23) Code snippets frequently changed are hotspots of bugs. -1
(A33) Developers working on their own code are less likely to introduce bugs. -2
(A28) Experienced developers introduce less bugs. -2
C
(A9) Having a testing team is mandatory to avoid bug-introducing changes. +3
(A4) Familiarity of developers with source code is mandatory to avoid bug-
introducing changes.
+2
(A22) Agile methods aim at reducing the delivery lifecycle. Therefore, when adopt-
ing these methods, developers are more likely to perform bug-introducing changes.
+1
(A40) Code written in a language with static typing (e.g., C#) introduces fewer
bugs than code written in a language with dynamic typing (e.g., Python).
+1
(A34) Merge commits introduce more bugs than other commits. 0
(A18) Familiarity of developers with the programming language adopted in a
project is mandatory to avoid bugs.
-2
(A29) Fixing bugs is riskier (more likely to introduce bugs) than adding new fea-
tures.
-3
D
(A6) When reviewing a change, reviewers need to be aware of all artifacts impacted
by this change to avoid the introduction of bugs.
+3
(A38) Geographically distributed teams introduce more bugs than teams that are
not geographically distributed.
+2
(A31) Root-canal refactorings are less prone to introduce bugs. This refactor-
ing is used for strictly improving the source code structure and consists of pure
refactoring.
+1
(A36) Developers working on code snippets containing third-party APIs tend to
introduce bugs.
0
(A2) Changes that neglect non-functional requirements are more likely to introduce
bugs.
0
(A14) Changes that break the software architectural design are more likely to
introduce bugs.
-1
(A19) Code reviews reduce considerably the introduction of bugs. -2
(A26) Handling the exceptions associated with a change is mandatory to avoid the
introduction of bugs.
-2
E
(A4) Familiarity of developers with source code is mandatory to avoid bug-
introducing changes.
+3
(A33) Developers working on their own code are less likely to introduce bugs. +2
(A9) Having a testing team is mandatory to avoid bug-introducing changes. 0
(A37) Introduction of bugs depends on which programming language is used. -1
4.1.1. Viewpoint A: No matter the software history, only tests and developers’
experience are important
Developers Characterization. Eleven out of 41 developers significantly
loaded into this viewpoint, which corresponds to 26% of all developers. Most of
these developers loaded into viewpoint A (64%) are from industry, and at least
45% of them have high experience in software development, tests, bug fixing
and code review.
Viewpoint interpretation. When we aim at reducing the introduction
of bugs, the high test coverage (A08) plays an important role in such reduc-
tion, mainly if the tests are performed by experienced developers (A28) in an
automated way (A10), regardless of the developer’s understanding of the software
history (A24).
The developers loaded into this viewpoint (Developers in A) strongly agree
that “Experienced developers introduce less bugs” (A28: +3). This belief re-
inforces a previous study [47] that indicates developers’ experience plays an
important role in the reduction of bugs. Similarly, the developers strongly agree
that “automated testing reduces considerably the introduction of bugs” (A10:
+3). Only in this viewpoint do the developers strongly agree with both assump-
tions A28 and A10, indicating they put substantial faith in the influence of the
developer’s experience and tests in the reduction of bugs. Still regarding tests,
Developers in A also agree that “changes with high test coverage introduce
less bugs” (A08: +2), differently from a previous study [4] that did not find
any statistically significant relation between code coverage and bugs. We also
observe that Developers in A are concerned with source code quality. In par-
ticular, they believe in the influence of code smells in the increase of bugs (A15:
+2), reinforcing a study [29] that suggests the existence of code smells is an
indicator of fault-proneness. Finally, the developers agree it is mandatory
to comprehend the rationale behind the changes to reduce the introduction of
bugs (A1: +1), confirming a previous study [31] that found that the rationale
of a change is the most important information for change understanding.
Although Developers in A defend the importance of understanding the
rationale behind a change, they strongly disagree that it is mandatory to un-
derstand the software history (A24: -3). Only in Viewpoint A do developers present
this strong disagreement regarding software history. Note also that even though
Developers in A put substantial faith in tests to avoid bug-introducing changes,
they disagree it is mandatory to have a testing team (A09: -2). This belief re-
inforces the findings of studies [55] indicating that companies, such as Google
and Facebook, obtained some benefits when they decided not to have a testing
team. For example, those companies could decrease the time-to-deployment,
and they perceived that developers felt more committed to producing quality
code. Finally, there is a contradiction in the literature regarding the influence
of developers working on their own code or not. While a recent study by Tufano
et al. [51] suggests that source code in which different developers are involved
is more related to bugs, another study [47] indicates that buggy code is more
strongly associated with a single developer’s contribution. Developers in A
reinforce this study [47] by disagreeing that developers working on their own
code may reduce the introduction of bugs (A33: -1).
Neutral. An existing study [3] has found that bug-introducing changes are
roughly three times larger (involving a high number of files) than other changes.
However, Developers in A do not present a position regarding this issue (A5:
0).
4.1.2. Viewpoint B: Do not Despise Its History Regardless of Your Experience
and Ownership
Five out of 41 participants significantly loaded on this viewpoint, which
corresponds to 12% of all participants. Most of the participants loaded into
viewpoint B (80%) work in the industry and have high experience in software
development and bug fixing.
Viewpoint interpretation: To avoid bugs, it is mandatory to understand
the software history (A24), focusing on changes that neglect non-functional
requirements (A2) or involve a large number of files (A5), regardless of the
developer’s experience (A28) or whether they are working on their own code (A33).
Agree. The developers loaded in this viewpoint (Developers in B) agree
with “understanding the software history is mandatory for developers to avoid
the introduction of bugs” (A24: +2), reinforcing existing studies [43] that suggest
the software history as a promising way to predict bugs. Regarding the scope
of the software history, developers are concerned with changes involving a large
number of files (A05: +2) or neglecting non-functional requirements (A02: +2).
Both factors have already been evidenced in previous studies [51, 32]. While
a study [51] presents a statistically significant relation between large changes
(involving a high number of files) and bugs, another study [32] suggests that non-
functional requirements are a root cause of bugs in most of the cases. Besides
being concerned with non-functional requirements, Developers in B are also concerned
with user requirements. In particular, they believe that manual testing (i.e.,
tests that verify which changes meet user requirements) is mandatory to avoid
bugs (A11: +1). This belief is contrary to a previous study [55] that discourages
manual testing since it is very expensive to perform and can turn into a
bottleneck.
Disagree. Developers in B disagree that “experienced developers intro-
duce less bugs” (A28: -2), reinforcing previous studies [51, 56] that indicate more
experienced developers are usually the authors of bug-introducing changes be-
cause they perform more complex tasks. Similarly, Developers in B disagree545
that “developers working on their own code are less likely to introduce bugs”
(A33: -2). As discussed in the previous section, this disagreement reinforces a re-
cent study [51]. Developers in B also disagree that “code snippets frequently
changed are hotspots of bugs” (A23: -1), indicating a perception contrary to
a recent study [57] that indicates more recently changed code has a higher
possibility of containing bugs.
Neutral. Even though one might argue for code reviews and the use of
static analysis tools to avoid bugs, Developers in B have a neutral position
regarding these issues.
4.1.3. Viewpoint C: Have your own testing team
Six out of 41 participants significantly loaded on this viewpoint, which corre-
sponds to 15% of all participants. Most of the participants loaded into viewpoint
C (83%) are from industry, and at least 50% of them have high experience in
software development, code review, and bug fixing.
Viewpoint interpretation: Having a testing team (A09) is mandatory to
avoid bug-introducing changes, regardless of whether a developer focuses on fixing bugs
(A29) or is familiar with the programming language adopted in a project
(A18).
Agree. Developers loaded in this viewpoint (Developers in C) strongly
agree that “Having a testing team is mandatory to avoid bug-introducing changes”
(A09: +3), indicating that the Developers in C have perceptions contrary to
previous studies [55, 37] that show a reduction of bugs when companies do
not adopt a testing team. They also believe familiarity with source code is
27
mandatory to avoid bug-introducing changes (A04:+2). This belief reinforces
a previous study [47] that indicates code snippets suffering interference from570
different developers are more difficult to be familiar with and, consequently,
developers tend to introduce more bugs. Moreover, Developers in C present a
concern by the typing (static or dynamic) supported by programming language.
They believe languages with static typing (e. g. C#) are less susceptible to bugs,
reinforcing the discussions of developers in specialized forums [53]. In a previous575
study [58], the authors indicate that one of the main benefits of agile methods
is to reduce the delivery lifecyle [58]. However, Developers in C believe it is
needed to be more careful about speeding up the software development process
since it may lead developers to introduce more bugs.
Disagree. However, Developers in C strongly disagree that “fixing bugs is more likely to introduce bugs than adding new features” (A29: -3). This belief is contrary to a study [48] indicating that developers may introduce more bugs when fixing existing ones than when adding new features. Similarly, the developers also disagree that familiarity with the programming language adopted in a project is mandatory to avoid bugs (A18: -2). This belief is contrary to some discussions in specialized forums, where participants argue for a high influence of the programming language on the reduction of bugs [53].
Neutral. A merge commit is a crucial moment in software development, since different parts of the system are integrated and conflicts can arise. Thus, one might expect that merge commits lead developers to introduce more bugs than other commits. However, Developers in C have a neutral position regarding this issue.
4.1.4. Viewpoint D: Be Aware of All Artifacts Impacted by Changes, Mainly in Geographically Distributed Teams
Three out of 41 participants significantly loaded on this viewpoint, which corresponds to 7% of all participants. All participants loaded into viewpoint D are from industry, and at least 33% of them have high experience in software development, bug reporting, and bug fixing.
28
Viewpoint interpretation: To avoid bugs, be aware of all artifacts impacted by a change when reviewing it, mainly in geographically distributed teams (A38). It does not matter whether code review (A19) and exception handling (A26) policies have been applied in the project.
Agree. Developers loaded in this viewpoint (Developers in D) strongly agree that it is necessary to be aware of all artifacts impacted by a change when reviewing it to avoid the introduction of bugs (A6: +3). This belief reinforces existing studies [59] that use change impact analysis to localize bugs. The developers also agree that “distributed teams introduce more bugs than teams that are not geographically distributed” (A38: +2). This belief complements the results of a previous study [10] which indicates that geographic distribution has a measurable effect on software quality, although the effect was not consistent: sometimes it was good, and sometimes bad. Still regarding software quality, Developers in D believe that “root-canal refactorings are less prone to introduce bugs” (A31: +1), indicating a perception contrary to a previous study [49] suggesting that refactorings tend to induce bugs very frequently.
Disagree. Although Bacchelli et al. [1] suggest that the main motivation for developers to review code is to find bugs, Developers in D disagree that code reviewing considerably reduces the introduction of bugs (A19: -2). Similarly, Developers in D disagree that it is mandatory to handle exceptions in order to avoid bugs (A26: -2), reinforcing previous studies [60] that suggest better testing for exception handling bugs since they are ignored by developers less often than other bugs. Developers in D also disagree that changes breaking the software architectural design are more likely to induce bugs (A14: -1). This belief is contrary to existing studies suggesting that, when the software architectural design is compromised by poor or hasty design choices, the architecture is often subject to different architectural problems or anomalies, which may lead to bugs [61].
Neutral. Finally, it is known that the reuse of third-party APIs has several benefits to software development, mainly in terms of saving time. On the other hand, a study [62] also indicates that the fault-proneness of APIs has been a main cause of application failures. The belief of Developers in D is contrary to these studies, since they adopt a neutral view regarding the impact of APIs on the introduction of bugs (A36: 0). Moreover, even though Cleland-Huang et al. [32] suggest that changes neglecting non-functional requirements are the root cause of bugs, Developers in D do not have a definitive view about this issue (A2: 0).
4.1.5. Viewpoint E: Be Familiar with the Code
Four out of 41 participants significantly loaded on this viewpoint, which corresponds to 9% of all participants. All participants loaded into viewpoint E are from industry, and at least 50% of them have high experience in software development and bug fixing.
Viewpoint interpretation: Familiarity with the source code is mandatory to avoid bug-introducing changes (A4; A33), regardless of the programming language (A37).
Agree. Developers loaded into this viewpoint (Developers in E) strongly agree that “familiarity of developers with source code is mandatory to avoid bug-introducing changes” (A4: +3), reinforcing existing studies [53, 56]. In [47], the author suggests that the higher the developers' experience with a piece of code, the better they know it and, consequently, the fewer bugs they introduce. On the other hand, studies [51, 56] also suggest that developers tend to perform slightly larger (and likely more complex) changes when they have more experience with the source code and, consequently, they induce more bugs [56]. Similarly, Developers in E also agree that “developers working on their own code are less likely to introduce bugs” (A33), reinforcing a study [51] which indicates that code maintained by a single developer is strongly associated with fewer bugs.
Disagree. We observe that Developers in E disagree that the introduction of bugs depends on which programming language is used (A37: -1). On the other hand, a previous study [52] indicates that statically typed languages are, in general, less fault-prone than dynamically typed ones.
Neutral. Although previous studies [55] involving large companies (e.g., Google and Facebook) discourage the adoption of a testing team, Developers in E assume a neutral position regarding this issue (A09: 0).
Although Viewpoints B and E look similar, they present some differences. While Viewpoint B involves software history, non-functional requirements, experience, and ownership, Viewpoint E is mainly related to code familiarity. One might argue that understanding the software history may lead to code familiarity, but software history and code familiarity still present differences. For instance, developers can be familiar with the current code of the software, yet not know its history.
Summary of RQ1. The viewpoints reveal groups of developers strongly believing that using automated testing (Viewpoint A), understanding the software history (Viewpoint B), having a testing team (Viewpoint C), being aware of impacted artifacts (Viewpoint D), and being familiar with source code (Viewpoint E) are mandatory to avoid bugs.
4.2. RQ2. What are the developers' reasons to agree with an assumption?
Here we investigate the reasons for developers to agree (or disagree) with an assumption. To answer this research question, we asked developers to comment on their reasons to agree or disagree with a specific assumption. We translated and lightly edited some relevant comments to provide qualitative evidence reflecting the developers' reasons. Table 5 summarizes the developers' reasons to agree or disagree with certain assumptions according to the viewpoints. The first and second columns describe the viewpoint and the assumptions associated with it, respectively. The third column presents the developers' reasons (comments) to agree or disagree with the assumption. Finally, the fourth column indicates whether the developers agree or disagree with the assumption. The complete list of comments is available on the accompanying web page [30].
Table 5: Developers' Reasons in Viewpoints

Viewpoint A
Agree:
- Experienced developers (A28): learn from mistakes.
- Automated testing (A10): tests should be cheap and easy to run again and again.
- High test coverage (A08): we can cover most of the corner cases.
- Understanding the rationale behind changes (A01): if you do not understand what you are doing with code, the chances of doing something wrong are higher.
Disagree:
- Working on their own code (A33): tendency to be more sloppy; we are tempted to just test what we are sure is not broken.
- Having a testing team (A09): the same developer who wrote the code can write good enough tests.
- Understanding the software history (A24): the history may show some past errors, but by itself it does not decrease the chances of a bug.

Viewpoint B
Agree:
- Applying manual tests (A11): they are important and complement automated tests.
- Understanding the software history (A24): it is possible to identify the locations that are more likely to introduce bugs.
- Avoiding large changes (A05): it is harder to check every single feature added/removed.
- Changes neglecting non-functional requirements (A02): they are the main cause of anomalies at run-time.
Disagree:
- Experienced developers (A28): they tend to work in critical modules.
- Avoiding code snippets frequently changed (A23): they are the most reviewed code in the system and, therefore, less likely to retain bugs.
- Developers working on their own code (A33): they may not be aware of mistakes that can introduce bugs, which may be recognized by external developers with a different point of view.

Viewpoint C
Agree:
- Having a testing team (A09): they tend to be less biased.
- Familiarity with the source code (A04): developers have a better sense of where to perform changes without introducing bugs.
- Avoiding agile methods (A22): they produce pressure on the developers, which may reduce the software quality.
- Using statically typed languages (A40): dynamically typed languages are more sensitive.
Disagree:
- Familiarity with the programming language (A18): many bugs are due to flaws in programming or business logic.
- Fixing bugs is riskier than adding new features (A29): during bug fixing, developers will focus on the primary objective, which is to prevent bugs.

Viewpoint D
Agree:
- Being aware of all artifacts impacted by a change (A06): reviewers need to be able to identify any anomaly inserted in the artifacts involved.
- Working on teams not geographically distributed (A38): ability of the leader to supervise the team.
Disagree:
- Handling exceptions (A26): not all exceptions should be handled by developers (i.e., runtime exceptions).
- Reviewing code (A19): reviewers are more focused on code quality.

Viewpoint E
Agree:
- Familiarity with source code (A04): the more I knew about the system, the better I was able to make (close to) bug-free code.
Disagree:
- Programming language used (A37): bugs exist in every language; it depends more on the skills of developers.

Viewpoint A. As previously discussed, Developers in A believe that experienced developers (A28), automated testing (A10), high test coverage (A08), and understanding the rationale behind changes (A01) may reduce the introduction of bugs. Moreover, they also believe that code smells may lead to bug-introducing changes (A15).
32
The comments of Developers in A provide some explanations regarding those beliefs. For instance, to explain why they agree with assumption A28, Developer #14 comments that “developers that have already written buggy code and learned from their mistakes are less likely to introduce bugs”. In addition, we put some terms in bold as a reference to the comment. This way, instead of describing the entire comment in Table 5, we use the text in bold to explain why they agree with the assumption.
Regarding automated testing, Developer #39 argues that “tests should be cheap and easy to run again and again on every single change”. Still regarding tests, Developer #30 presents a reasonable explanation of the influence of test coverage on reducing bug-introducing changes: “I think if we test most of (or all) the branches in a particular method with different inputs, we can cover most of the corner cases”. Finally, concerning rationale, Developer #5 argues that “if you do not understand what you are doing with code, the chances to do something wrong are higher”.
Developers in A disagree that working on their own code (A33), having a testing team (A09), or understanding the software history (A24) is mandatory to avoid bugs. Developer #10 argues that “developers tend to be more sloppy when working alone or on their own code”. Developer #21 also explains that “when we develop, we tend to get used to the code we produce, and we are tempted to just test what we are sure is not broken. So, the chance to introduce a bug is higher”. Regarding having a testing team, Developer #30 argues that “the same developer who wrote the code can write good enough tests to avoid bug-introducing changes”. Finally, Developer #4 argues that “the history may show some past errors, but it by itself does not decrease the chances of a bug”.
Viewpoint B. Developers in B agree that understanding the software history (A24), applying manual tests (A11), avoiding large changes (A05), and watching changes that neglect non-functional requirements (A02) may reduce the introduction of bugs.
Developer #34 argues that “by understanding the software history, it is possible to identify the locations that are more likely to introduce bugs and thereby avoid them”. Moreover, Developer #3 argues that “manual tests are important and complement automated ones to avoid the introduction of bugs”. Regarding large changes, Developer #30 argues that “I think it is harder to test all the changes (manually or automatically) if we have a lot of them, thus the odds of introducing bugs are higher”. Still regarding changes, Developer #10 argues that “it's harder to check every single feature added/removed and its effect on the other parts of the code”. Finally, Developer #3 argues that “neglecting non-functional requirements is the main cause of anomalies at run-time”.
Developers in B disagree that relying on experienced developers (A28), avoiding code snippets frequently changed (A23), and working on their own code (A33) may avoid bugs. Developer #34 argues that “experienced developers tend to work in critical modules of the software, then they tend to introduce more bugs than non-experienced developers”. Regarding code snippets frequently changed, Developer #11 explains that “code snippets frequently changed are the most reviewed code in the system and, therefore, it is less likely to contain bugs”. Finally, Developer #27 argues that “developers working on their own code may not be aware of mistakes that can introduce bugs, which may be recognized by external developers that may have a different point of view”.
Viewpoint C. Developers in C agree that having a testing team (A09), familiarity with the source code (A04), avoiding agile methods (A22), and using statically typed languages (A40) may reduce bug-introducing changes. Developer #17 argues that “the testing team tends to be less biased when testing the source code since their members did not write the code”. Regarding familiarity with the source code, Developer #38 argues that “the familiarity of the developers with the source code provides a better sense of where they can change without introducing bugs”. Developer #16 argues that “shorter delivery cycles in the agile methods produce pressure behind the developers to provide fast deliveries on the software development in the industry, which may reduce the software quality”. Developer #16 also argues that “dynamic typing is more sensitive to introduce bugs by the developer, e.g., when values of different types are assigned to the same variable, the test cases require more effort for the tester, and eventually, this type of error goes unnoticed”.
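To make this concern concrete, consider a minimal, hypothetical Python sketch of the failure mode Developer #16 describes; the function and values below are illustrative only and are not drawn from any participant's project.

```python
# Minimal sketch (hypothetical): in a dynamically typed language, values of
# different types can be assigned to the same variable, so a type mix-up
# only surfaces at run time, and only if a test exercises this path.

def total_price(quantity, unit_price):
    # Intended contract: both arguments are numbers.
    return quantity * unit_price

price = 10.0
price = "10.0"  # Re-assignment silently changes the type.

print(total_price(3, 10.0))   # 30.0, as intended
print(total_price(3, price))  # "10.010.010.0": a silent bug, not an error

# A static type checker (e.g., mypy with annotations such as
# `def total_price(quantity: int, unit_price: float) -> float`) would
# reject the second call before the code ever runs.
```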
Although Developers in C agree with the assumptions A4, A9, A22, and A40, they disagree that familiarity with the programming language (A18) may reduce bugs, and that fixing bugs is riskier than adding new features (A29). Developer #16 explains that “independent of the programming language, many bugs are due to flaws in programming or business logic”. Regarding fixing bugs versus adding new features, Developer #11 argues that “during bug fixing, developers will focus on the primary objective, which is to prevent bugs, whereas when delivering a feature, some bugs can be introduced”.
Viewpoint D. Developers in D agree that being aware of all artifacts impacted by a change (A06) and working on teams that are not geographically distributed (A38) may reduce bugs. Developer #24 argues that “reviewers need to be able to identify any anomaly inserted in the artifacts involved”. Regarding geographically distributed teams, Developer #19 argues that “when the team leader is weak, even if the team is geographically in the same place, there may be a higher number of bugs introduced when compared to a team with proper supervision that works geographically separate”; however, the latter case requires much more management effort. They also explain that “teams with different time zones are an obstacle for easy and effective communication”.
Developers in D disagree that handling exceptions (A26) and reviewing code (A19) may reduce bugs. Developer #24 argues that “not all exceptions should be handled by developers (i.e., runtime exceptions)”. Regarding code review, Developer #26 argues that “during the code review, reviewers are more focused on code quality (such as conventions, bad smells, architecture, and tests) than on checking if an algorithm is implemented correctly or if the change will introduce some bugs”.
Viewpoint E. Developers in E agree that familiarity with source code (A04) may reduce bugs. In particular, Developer #36 argues that “when I worked with unfamiliar systems, my overall feeling was that the more I knew about the system, the better I was able to make (close to) bug-free code”.
On the other hand, Developers in E disagree that the introduction of bugs depends on which programming language is used (A37). Developer #16 argues that “the introduction of bugs is independent of the programming language”, while Developer #40 complements this by arguing that “bugs exist in every language; it depends more on the skills of developers than the language itself”.
Summary of RQ2. Developers rely on personal and technical reasons to agree (or disagree) with some assumptions. Personal reasons involve skills, knowledge, focus, ability to learn, awareness of the task being performed, neglect, bias, and pressure. The technical ones include the adoption of cheap and easy-to-run tests, critical modules, flaws in programming or business logic, programming language sensitivity, and code coverage.
4.3. RQ3. Which assumptions are consensus/dissensus among developers' viewpoints?
Although developers have different viewpoints on the relevance of assumptions, it is expected that they have some views in common. RQ3 intends to investigate the assumptions that are consensus or dissensus among the developers' viewpoints. In this section, we analyze the assumptions that are consensus (A7, A17, and A37) or dissensus (A18, A24, A25, A33, and A34) among the developers' viewpoints, as described in Table 3. We use the notation [A33: -1*, -2**, 0, +1, +2*], which means that we are analyzing assumption A33, whose rank is -1 in viewpoint A, -2 in viewpoint B, 0 in viewpoint C, +1 in viewpoint D, and +2 in viewpoint E.
4.3.1. Consensus Assumptions
Regardless of the different developers' viewpoints, we observe consensus on three assumptions: A7, A17, and A37.
A7: Be careful with a high number of features. In all the viewpoints (A-E), developers have a consensus regarding the assumption “Changes invoking a high number of features (methods, fields, classes) are more likely to introduce bugs” [A7: +1, +3, +1, +2, +1]. The developers argue that the larger the number of invoked features, the more difficult and complex it is to track the impact of the changes, making it more likely that bugs are introduced.
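As an illustration of how this assumption could be operationalized in practice, the following Python sketch counts the distinct features a changed snippet invokes; the helper name, the sample change, and the threshold are all hypothetical, not part of our study.

```python
# Hypothetical sketch: approximate assumption A7 by counting the distinct
# features (functions/methods) that a changed snippet invokes.
import ast

def invoked_features(source: str) -> set:
    """Collect the names of functions/methods invoked in a code snippet."""
    features = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call):
            func = node.func
            if isinstance(func, ast.Name):
                features.add(func.id)       # plain call: f(...)
            elif isinstance(func, ast.Attribute):
                features.add(func.attr)     # method call: obj.m(...)
    return features

change = """
def sync(account):
    data = fetch(account.id)
    validate(data)
    store(normalize(data))
    notify(account.owner, render(data))
"""

features = invoked_features(change)
if len(features) > 5:  # illustrative threshold, not an empirical one
    print(f"Review carefully: change invokes {len(features)} features")
```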
A17: Compile and run the system with the changes. In most of the developers' viewpoints, there is a strong agreement consensus with the assumption “Before accepting pull-requests, it is mandatory to compile and run the system with the changes to avoid the introduction of bugs” [A17: 0, +3, +3, +3, +1]. Developers believe it is fundamental, or even rather obvious, to compile and run the system before committing changes to avoid conflicts or to identify even the simplest bugs.
A37: The programming language does not matter. Unlike the previous assumptions, developers present a consensus of disagreement. Most of the developers strongly disagree that “Introduction of bugs depends on which programming language is used” [A37: -3, -3, -3, -3, -1]. Developers argue that bugs exist in every programming language, and the introduction of bugs depends more on the knowledge or skills of a developer than on the programming language itself.
4.3.2. Dissensus Assumptions
There is no concordance among the developers' viewpoints on the assumptions A24, A25, A33, and A34.
A24: Understanding software history. The developers' viewpoints hold different beliefs about this assumption. They dissent upon: “Understanding the software history is mandatory for developers to avoid the introduction of bugs” [A24: -3, +2, -1, 0, -1]. While developers loaded in Viewpoint B agree with this assumption, the developers loaded in Viewpoints A, C, and E disagree. Developers in B argue that, by understanding the software history, it is possible to avoid the hot-spots that are more prone to introduce bugs. On the other hand, Developers in A, C, and E argue that there is no relation between software history and bugs. They also comment that, even though understanding the software history may reveal some past errors, by itself it does not decrease the introduction of bugs.
A25: Code reuse. Regarding the assumption “Code reuse avoids the introduction of bugs” [A25: +1, -2, +2, +1, -3], developers loaded in Viewpoints A, C, and D agree with it, while developers loaded in the remaining viewpoints disagree. Developers in A, C, and D argue that reused code is already highly tested and mature, and is thus “safer” than writing new (and supposedly bug-free) code. On the other hand, Developers in B and E comment that reusing code does not mean the code is correct. The intuition behind these comments relies on the fact that “copy + paste” from StackOverflow is not exactly a good practice, since the code may not be complete or correct. Moreover, those developers also explain that, when reusing code containing bugs or smells, you are only spreading them.
A33: Code ownership. The developers' viewpoints dissent on “Developers working on their own code are less likely to introduce bugs” [A33: -1, -2, 0, +1, +2]. Developers in D and E believe that code ownership is an important aspect to prevent the introduction of bugs, since developers who have worked on the code are expected to have better control of and knowledge about it. On the other hand, Developers in A and B do not believe in code ownership. They argue that developers tend to be more sloppy and, consequently, may not be aware of their own mistakes during software development.
A34: Awareness of merge commits. Similarly to the previous assumption, the developers' viewpoints dissent on “Merge commits introduce more bugs than other commits” [A34: -2, +1, 0, +2, -2]. Developers in A and E disagree that merge commits are more likely to introduce bugs. They argue that merge commits are generally fairly straightforward and easier to resolve. On the other hand, Developers in B and D agree with assumption A34. They describe three main reasons: (i) usually, the developers who perform the merge are not the same who created the conflict; (ii) merge commits are harder to inspect than other commits, since they involve more files; and (iii) a merge is the result of development, and bugs can already be introduced before it.
Summary of RQ3. Assumptions involving the number of features, the programming language, and system compilation and execution produce more consensus among developers than assumptions involving software history, merge commits, code reuse, and ownership.
5. Implications
We discuss the implications of our study for researchers and practitioners.
5.1. Practitioners
Our study revealed developers' viewpoints regarding a reduced number of assumptions (RQ1). Based on these viewpoints, practitioners can adopt or rethink certain development, code review, or project management practices. For instance, Viewpoint D reveals the importance of developers being aware of all artifacts impacted by changes. This viewpoint raises an alert for developers to be more careful with the artifacts impacted by new changes. Viewpoints A and B reveal the importance of the developers' experience, indicating that code reviewers should consider the developers' experience while reviewing their commits. Viewpoint C reveals the importance of having a testing team in the project, suggesting that project managers should consider a testing team while defining the organization of the project teams.
Our study also indicates that developers rely on technical and personal reasons to agree or disagree with the assumptions. Thus, companies should provide ways to improve not only technical but also personal aspects of the team. For example, companies could provide education opportunities for developers to improve their skills, their ability to learn, and their ability to deal with pressure. Project managers should pay more attention to personal aspects while managing the project team, and reviewers should consider personal aspects more carefully while doing their reviews.
Finally, our study indicates that the assumptions A7, A17, and A37 are consensus among developers. While the developers agree with A7 (Be careful with a high number of features) and A17 (Compile and run the system with the changes), they disagree with assumption A37 (The programming language does not matter). The consensus about these assumptions suggests that practitioners (mainly developers and reviewers) should be more careful with changes invoking a high number of features (A7). Also, practitioners should compile and run the system with the changes before accepting them (A17). On the other hand, the consensus suggests that practitioners should not put much relevance on the programming language (A37).
5.2. Researchers
Besides the practical contributions of our work to the development process, researchers can also benefit from our results. First, researchers can further explore developers' viewpoints (RQ1), aiming to complement existing empirical evidence. For instance, consider Viewpoint A, which indicates that “no matter the software history, only tests and developers' experience are important”. This viewpoint leads researchers to investigate the group of assumptions that compose the viewpoint concomitantly, not separately as in previous studies [63, 64, 65]. Indeed, researchers can investigate all the viewpoints revealed in the study to analyze the benefits and consequences of each viewpoint in development, review, and project management practices.
Furthermore, additional questions can be raised when the relevance that developers put on some assumptions is contrary to previous studies, as described in RQ1. For instance: (i) developers might not know the results of previous studies; as a consequence, they put more or less relevance on the assumptions based on their personal experience [66] without considering existing studies. This lack of knowledge indicates that researchers should make the results of their studies better known, so that developers can form perceptions based on empirical evidence; (ii) even though developers should consider empirical evidence, that evidence can be questioned. Indeed, some studies might need additional data and/or experiments to support statistical significance [67]. Thus, developers may decide not to follow the findings of existing studies because they might not be convinced by them.
Third, the developers' comments described in RQ2 raise some points that can lead to further investigation. For example, Developer #30 argues that a developer is able to write code and test it and, thus, it is not necessary to have a testing team. In this case: Is a developer really able to write and test his/her own code? Under which circumstances is a developer able to write and test their own code?
Last, the dissensus assumptions reflect divergent perceptions of developers about specific assumptions (RQ3). Researchers could further investigate the different perspectives of the developers, i.e., of those who agree or disagree with an assumption, aiming to analyze the subjectivity involved in the factor that composes the assumption. For instance, consider assumption A25, about code reuse, on which the developers diverge. In this case, researchers could empirically investigate the reasons why some developers agree or disagree with this assumption, aiming to clarify in which circumstances developers should dedicate more or less attention to code reuse.
6. Threats to Validity
6.1. Construct Validity
The Q-Methodology is commonly misinterpreted as a survey, but they are very different methods. Probably the key distinction is that Q-Methodology aims at discovering the diverse viewpoints about a topic under study, in opposition to a survey that aims at having a sample allegedly representing the proportion of those viewpoints in a given population [6]. As a consequence, Q-Methodology studies need neither large nor random samples. Also, surveys can lead to biases in responses, e.g., some participants might be more favorable than others. The Q-sort structure, on the other hand, may follow a quasi-normal distribution, so the participants are required to prioritize their perceptions instead of being mostly positive, neutral, or negative.
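For concreteness, the sketch below shows how such a forced quasi-normal grid constrains a participant's ranking; the exact column sizes used here are an assumption for illustration purposes, chosen only so that they sum to the 41 statements of our study.

```python
# Minimal sketch of a forced quasi-normal Q-sort grid. Only the shape
# matters for the argument above: few slots at the extremes, many near zero.
from collections import Counter

RANKS = [-3, -2, -1, 0, 1, 2, 3]
SLOTS = [3, 5, 7, 11, 7, 5, 3]   # assumed column sizes; sums to 41
FORCED = dict(zip(RANKS, SLOTS))

def is_valid_q_sort(ranking: dict) -> bool:
    """ranking maps statement id -> rank; check it fits the forced grid."""
    counts = Counter(ranking.values())
    return all(counts.get(rank, 0) == n for rank, n in FORCED.items())

# A participant cannot rate everything +3: only three slots exist at +3,
# so they must prioritize, which is what the quasi-normal grid enforces.
example = {f"A{i}": rank for i, rank in enumerate(
    [-3] * 3 + [-2] * 5 + [-1] * 7 + [0] * 11 + [1] * 7 + [2] * 5 + [3] * 3,
    start=1)}
assert is_valid_q_sort(example)
```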
One might consider that 41 assumptions may not be enough to support a broad understanding of the developers' viewpoints regarding bug-introducing changes. However, we selected assumptions involving a diversity of factors in the software engineering field, extracted from diverse sources. In general, 40 to 80 assumptions are used in Q-Methodology studies [13, 6]. Still regarding the assumptions, they may present ambiguity and similarity among them. Three professionals followed the guideline adopted in a previous study [20] to identify ambiguous and similar assumptions. As a result of this analysis, we removed ambiguous assumptions and kept only one among the similar ones. Another threat is related to the number of participants (P-Set) involved in our study. Although the number of participants can be considered small for some conventional analyses, Q-Methodology avoids the “magic number” in a certain sense because it studies qualitative differences, on which quantity has no effect. Indeed, Q-Methodology is most effective when applied to groups chosen to make sure that the viewpoints relevant to the research question are included, rather than to a vast group of participants [12]. Moreover, our P-Set is in accordance with our previous work [6] that analyzes the developers' viewpoints regarding systematic reviews. Even aware of these facts, we mitigated this threat by involving developers with practical experience in issues related to bugs.
Even so, one might argue that a quasi-normal distribution is not appropriate to capture the participants' perceptions regarding the assumptions asked (e.g., one participant could have a more negative overall perception regarding code complexity); such a participant's perception (i.e., the Q-Sort) would more likely follow a negatively skewed curve. Although we concur, we believe the mitigation plan here is to have a comprehensive set of assumptions that covers a wide range of perceptions. Therefore, it is hardly the case that participants disagree, or agree, with every assumption, since many of them are nearly opposites.
6.2. Conclusion Validity
To perform our study, we used techniques such as Pearson's correlation, PCA, the Guttman rule, and cumulative explained variance, which have a high influence on the findings of our study. However, we selected those techniques by considering mathematically optimal choices instead of subjective measures. Moreover, those techniques have been used by existing studies [6] involving Q-Methodology. To mitigate this threat, we made our dataset with the participants' answers (Q-Sorts) available on the accompanying web page [30]. Researchers can thus explore this dataset in different ways.
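For readers who want to explore the dataset, the following Python sketch chains the techniques named above on placeholder data; it is a minimal approximation of the analysis steps, assuming a matrix with one column per participant, and the random values stand in for real Q-sorts.

```python
# Hedged sketch of the analysis steps named above, assuming a matrix
# `q_sorts` with one row per assumption (41) and one column per
# participant (41); the random data here is a placeholder.
import numpy as np

rng = np.random.default_rng(0)
q_sorts = rng.integers(-3, 4, size=(41, 41))  # placeholder Q-sorts

# 1. Pearson correlation between participants (columns).
corr = np.corrcoef(q_sorts, rowvar=False)

# 2. PCA: eigen-decomposition of the correlation matrix.
eigenvalues, _ = np.linalg.eigh(corr)
eigenvalues = eigenvalues[::-1]               # sort in descending order

# 3. Guttman (Kaiser) rule: retain factors with eigenvalue > 1.
retained = int(np.sum(eigenvalues > 1.0))

# 4. Cumulative explained variance of the retained factors.
cum_var = eigenvalues[:retained].sum() / eigenvalues.sum()
print(f"retained={retained}, cumulative variance={cum_var:.0%}")
```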
Another threat to be considered is the reduction of the number of viewpoints analyzed. Initially, we obtained eight viewpoints, but we analyze only five. In this case, we rely on Q-Methodology, which supports the use of qualitative criteria to make such a reduction, as described in Section 3. Researchers interested in investigating other solutions can analyze all the viewpoints available on the accompanying web page [30].
One might argue that some assumptions may generate disagreement solely because the study did not invest the effort to produce a common understanding of the assumptions among participants. We could have discussed each assumption with the participants to produce a common understanding among them. However, those discussions could have skewed the developers' perceptions. Our goal was to explore the developers' perceptions according to their own beliefs, knowledge, and experience.
6.3. Internal Validity
Each developer is responsible for ranking 41 assumptions related to a variety of factors. This number of assumptions may imply that some developers rank some assumptions without paying the required attention. To mitigate this threat, we allowed the developers to interrupt and resume their analysis. Thus, they could do a subset of the analysis up to the point they felt tired and resume it when they felt prepared to continue.
6.4. External Validity
Although we have identified different viewpoints, they may not reflect the developers' beliefs in a broader way. Indeed, the majority of the participants are from South America, so we cannot conclude that our results represent the perceptions of developers spread around the world. However, Q-Methodology does not attempt to make a claim of universal relevance or to represent the views of a larger sample [26]. Also, the developers' comments explaining why they agree/disagree with an assumption represent complementary evidence, requiring further investigation to understand the developers' reasons to agree or not with an assumption.
7. Related Work
Previous studies have investigated the developers' beliefs, perceptions, personality, and assumptions regarding different factors in software engineering. For instance, Meyer et al. [7] investigated the developers' perceptions about productive and unproductive work through a survey and an observational study. The results indicated that interruptions and context switches impact productivity significantly. In another study, they investigated the variation in productivity perceptions based on a survey with 413 developers [8], identifying six groups of developers with similar perceptions of productivity: social, lone, focused, balanced, leading, and goal-oriented. Similarly to our study, Meyer et al. [7, 8] aim at identifying common perceptions among developers. However, they focused on aspects that may impact productivity instead of factors involved in avoiding bug-introducing changes during software development. In addition, they did not focus on identifying consensus/dissensus among developers' viewpoints.
Smith et al. [9] investigated the relations between work practices, beliefs, and personality traits, involving 797 software engineers at Microsoft. The results suggest some differences between engineers, developers, and managers in terms of five personality domains (OCEAN: openness, conscientiousness, extraversion, agreeableness, and neuroticism). For instance, developers who chose to build tools were more open, conscientious, extraverted, and less neurotic. Engineers who agreed with the assumption “Agile development is awesome” were more extroverted and less neurotic. Managers were more conscientious and extraverted. On the other hand, there was no personality difference between developers and testers, introducing some questions to previous research. They presented interesting insights regarding developers' beliefs, work practices, and personality. However, the authors did not explore comparisons of these findings with empirical evidence.
In a recent study, Matthies et al. [11] investigated the perceptions of agile development practices and their usage in Scrum teams. The authors collected the perceptions of 42 students through surveys. The results indicated that the Scrum methodology impacts the students' views of the employed development practices. In fact, most of the students attributed to the agile manifesto the main reason to use the version control system according to agile ideas. The main limitation of this work is that perceptions were collected only from students of a software engineering course. As a consequence, those perceptions may not reflect the viewpoints of professional developers.
Wan et al. [66] investigated the expectations, behavior, and thinking of practitioners about defect prediction in practice. The authors performed a mixed (qualitative and quantitative) study, collecting statements about defect prediction from the literature and open-ended interviews. The results indicated that 90% of the participants are interested in adopting defect prediction techniques. Regarding the practitioners' perceptions of factors related to defect prediction, the results indicated a disconnection between the empirical evidence in prior studies and the practitioners' perceptions. Our study goes beyond statement analysis by following the Q-methodology to extract viewpoints. This way, we could obtain insights not only about developers' beliefs on individual statements but also about the beliefs of groups of developers on sets of statements.
In [10], the authors investigated the developers' beliefs in empirical software engineering. They gathered a set of claims, mostly from the literature, aiming to investigate the accordance between empirical evidence and developers' beliefs. The results indicated that belief is mainly based on personal experience rather than on findings in empirical research. The results also suggested that actual evidence in a project does not have a strict relation with the beliefs. The beliefs collected in this study serve as a basis for our study. While they investigated developers' beliefs regarding empirical software engineering evidence, our study focused on analyzing the developers' beliefs about avoiding bug-introducing changes during software development. To do that, we edited some of the beliefs collected in their study [10] to focus on bug-introducing changes. Moreover, we extended the number of beliefs to be analyzed, involving 41 beliefs instead of only the 16 described by the authors [10].
The studies mentioned above are concerned with viewpoints of a group of people – usually developers – regarding diverse topics, but none of them focused on analyzing developers' viewpoints on avoiding bug-introducing changes. Besides, the majority of them applied either qualitative (surveys) or quantitative approaches to explore subjectivity. Brown [26] critiques this dichotomy, advocating in favor of Q-Methodology as a possible alternative. In our recent study [6], we used the Q-Methodology, which transcends this argument because Q is neither entirely qualitative nor fully quantitative. By combining the strengths of both qualitative and quantitative research, Q-Methodology allows for the simultaneous study of objective and subjective issues to determine an individual's viewpoints [16].
8. Conclusions
We presented a study aimed at investigating the developers' perceptions of assumptions to avoid bug-introducing changes. In particular, we analyzed the relevance of the assumptions, the main reasons for developers to agree or disagree with them, and which assumptions are consensus/dissensus among the developers. We involved 41 developers who analyzed 41 assumptions involving a diversity of factors.
We identified five main developers' viewpoints. Some developers agreed that using automated testing (Viewpoint A), understanding the software history (Viewpoint B), having a testing team (Viewpoint C), being aware of impacted artifacts (Viewpoint D), and being familiar with source code (Viewpoint E) are mandatory to avoid bugs. On the other hand, other developers strongly disagreed with the relevance of understanding the software history to avoid the introduction of bugs (Viewpoint A). After analyzing the developers' comments, we observed that developers rely on personal and technical reasons to agree/disagree with some assumptions.
References1110
[1] A. Bacchelli, C. Bird, Expectations, outcomes, and challenges of modern
code review, in: Proceedings of the 2013 International Conference on Soft-
ware Engineering, 2013, pp. 712–721.
[2] L. Pascarella, D. Spadini, F. Palomba, M. Bruntink, A. Bacchelli, Infor-
mation needs in contemporary code review, Proc. ACM Hum.-Comput.1115
Interact. 2 (CSCW). doi:10.1145/3274404.
[3] J. ´
Sliwerski, T. Zimmermann, A. Zeller, When do changes induce fixes?,
SIGSOFT Softw. Eng. Notes 30 (4) (2005) 1–5. doi:10.1145/1082983.
1083147.
[4] N. C. Borle, M. Feghhi, E. Stroulia, R. Greiner, A. Hindle, Analyzing the1120
effects of test driven development in github, in: Proceedings of the 40th
International Conference on Software Engineering, 2018, pp. 1062–1062.
doi:10.1145/3180155.3182535.
[5] L. MacLeod, M. Greiler, M. Storey, C. Bird, J. Czerwonka, Code reviewing
in the trenches: Challenges and best practices, IEEE Software 35 (4) (2018)1125
34–42. doi:10.1109/MS.2017.265100500.
[6] B. Cartaxo, G. Pinto, S. Soares, Rapid Reviews in Software Engineer-
ing, Springer International Publishing, 2020, pp. 357–384. doi:10.1007/
978-3-030-32489-6_13.
[7] A. N. Meyer, T. Fritz, G. C. Murphy, T. Zimmermann, Software developers’1130
perceptions of productivity, in: Proceedings of the 22nd ACM SIGSOFT
International Symposium on Foundations of Software Engineering, 2014,
pp. 19–29. doi:10.1145/2635868.2635892.
[8] A. N. Meyer, T. Zimmermann, T. Fritz, Characterizing software develop-
ers by perceptions of productivity, in: Proceedings of the 11th ACM/IEEE1135
International Symposium on Empirical Software Engineering and Measure-
ment, 2017, pp. 105–110. doi:10.1109/ESEM.2017.17.
48
[9] E. K. Smith, C. Bird, T. Zimmermann, Beliefs, practices, and personalities
of software engineers: A survey in a large software company, in: Proceed-
ings of the 9th International Workshop on Cooperative and Human Aspects1140
of Software Engineering, 2016, pp. 15–18. doi:10.1145/2897586.2897596.
[10] P. Devanbu, T. Zimmermann, C. Bird, Belief & evidence in empirical soft-
ware engineering, in: Proceedings of the 38th International Conference on
Software Engineering, 2016, pp. 108–119. doi:10.1145/2884781.2884812.
[11] C. Matthies, J. Huegle, T. D¨urschmid, R. Teusner, Attitudes, beliefs,1145
and development data concerning agile software development practices,
in: Proceedings of the 41st International Conference on Software Engi-
neering: Software Engineering Education and Training, 2019, pp. 158–169.
doi:10.1109/ICSE-SEET.2019.00025.
[12] S. R. Brown, Political subjectivity: Applications of Q methodology in po-1150
litical science, Yale University Press, 1980.
[13] S. Watts, P. Stenner, Doing Q Methodological Research: Theory,
Method and Interpretation, Sage Publications, 2012. doi:10.4135/
9781446251911.
[14] J. M. Lee, H. J. Kim, J. Y. Rha, Shopping for society? Consumers’ value1155
conflicts in socially responsible consumption affected by retail regulation,
Sustainability (Switzerland) 9 (11). doi:10.3390/su9111968.
[15] M. A. Pereira, J. R. Fairweather, K. B. Woodford, P. L. Nuthall, Assess-
ing the diversity of values and goals amongst Brazilian commercial-scale
progressive beef farmers using Q-methodology, Agricultural Systems 1441160
(2016) 1–8. doi:10.1016/j.agsy.2016.01.004.
[16] R. Cross, Exploring attitudes: The case for q methodology, Health educa-
tion research 20 (2005) 206–13. doi:10.1093/her/cyg121.
[17] W. Stephenson, Correlating persons instead of tests, Journal of Personality
4 (1) (1935) 17–24. doi:10.1111/j.1467-6494.1935.tb02022.x.1165
49
[18] Y. Yang, A brief introduction to q methodology, Int. J. Adult Vocat. Educ.
Technol. 7 (2) (2016) 42–53. doi:10.4018/IJAVET.2016040104.
[19] A. Zabala, U. Pascual, Y. Xia, Bootstrapping q methodology to improve
the understanding of human perspectives, in: PloS one, 2016, pp. 1–19.
[20] J. Paige, K. Morin, Q-sample construction: A critical step for a q-1170
methodological study, Western journal of nursing research 38. doi:10.
1177/0193945914545177.
[21] K. Pearson, Note on regression and inheritance in the case of two parents,
Proceedings of the Royal Society of London 58 (1895) 240–242.
URL http://www.jstor.org/stable/1157941175
[22] C. Spearman, The proof and measurement of association between two
things, The American Journal of Psychology 15 (1) (1904) 72–101.
URL http://www.jstor.org/stable/1412159
[23] S. Ramlo, Centroid and theoretical rotation: Justification for their use in
q methodology research., Mid-Western Educational Researcher 28 (1).1180
[24] R. Larsen, R. T. Warne, Estimating confidence intervals for eigenvalues
in exploratory factor analysis, Behavior Research Methods 42 (3) (2010)
871–876. doi:10.3758/BRM.42.3.871.
[25] L. Guttman, Some necessary conditions for common-factor analysis, Psy-
chometrika 19 (2) (1954) 149–161. doi:10.1007/BF02289162.1185
[26] S. R. Brown, A primer on q methodology, Operant subjectivity 16 (3–4)
(1993) 91–138.
[27] R. J. Larsen, M. L. Marx, An introduction to mathematical statistics and
its applications, 3rd Edition, Prentice Hall, 2001.
[28] A. Zabala, Qmethod: A package to explore human perspectives using q1190
methodology, The R Journal 6 (2) (2014) 163–173.
50
[29] N. Sae-Lim, S. Hayashi, M. Saeki, How do developers select and prioritize
code smells? a preliminary study, in: 2017 IEEE International Conference
on Software Maintenance and Evolution (ICSME), 2017, pp. 484–488. doi:
10.1109/ICSME.2017.66.1195
[30] J. Souza, Complementary material (2020).
URL https://r4phael.github.io/dev_bic/
[31] Y. Tao, Y. Dang, T. Xie, D. Zhang, S. Kim, How do software engi-
neers understand code changes?: An exploratory study in industry, in:
Proceedings of the ACM SIGSOFT 20th International Symposium on1200
the Foundations of Software Engineering, 2012, pp. 51:1–51:11. doi:
10.1145/2393596.2393656.
[32] J. Cleland-Huang, R. Settimi, O. BenKhadra, E. Berezhanskaya,
S. Christina, Goal-centric traceability for managing non-functional require-
ments, in: Proceedings of the 27th International Conference on Software1205
Engineering, 2005, pp. 362–371. doi:10.1145/1062455.1062525.
[33] T. Fritz, G. Murphy, E. Murphy-Hill, J. Ou, E. Hill, Degree-of-knowledge:
Modeling a developer’s knowledge of code, ACM Transactions on Software
Engineering and Methodology (TOSEM) 23. doi:10.1145/2512207.
[34] O. Kononenko, O. Baysal, L. Guerrouj, Y. Cao, M. W. Godfrey, Investi-1210
gating code review quality: Do people and participation matter?, in: 2015
IEEE International Conference on Software Maintenance and Evolution
(ICSME), 2015, pp. 111–120. doi:10.1109/ICSM.2015.7332457.
[35] A. Demuth, R. Kretschmer, A. Egyed, D. Maes, Introducing traceabil-
ity and consistency checking for change impact analysis across engineering1215
tools in an automation solution company: An experience report, in: 2016
IEEE International Conference on Software Maintenance and Evolution
(ICSME), 2016, pp. 529–538. doi:10.1109/ICSME.2016.50.
51
[36] T. Dao, L. Zhang, N. Meng, How does execution information help with
information-retrieval based bug localization?, in: 2017 IEEE/ACM 25th1220
International Conference on Program Comprehension (ICPC), 2017, pp.
241–250. doi:10.1109/ICPC.2017.29.
[37] L. Prechelt, H. Schmeisky, F. Zieris, Quality experience: A grounded theory
of successful agile projects without dedicated testers, in: 2016 IEEE/ACM
38th International Conference on Software Engineering (ICSE), 2016, pp.1225
1017–1027. doi:10.1145/2884781.2884789.
[38] C. Boyapati, S. Khurshid, D. Marinov, Korat: Automated testing based on
java predicates, in: Proceedings of the 2002 ACM SIGSOFT International
Symposium on Software Testing and Analysis, 2002, pp. 123–133. doi:
10.1145/566172.566191.1230
[39] L. Sousa, A. Oliveira, W. Oizumi, S. Barbosa, A. Garcia, J. Lee, M. Kali-
nowski, R. de Mello, B. Fonseca, R. Oliveira, C. Lucena, R. Paes, Identi-
fying design problems in the source code: A grounded theory, in: Proceed-
ings of the 40th International Conference on Software Engineering, 2018,
pp. 921–931. doi:10.1145/3180155.3180239.1235
[40] G. A. Oliva, M. A. Gerosa, Change coupling between software arti-
facts: Learning from past changes, in: C. Bird, T. Menzies, T. Zim-
mermann (Eds.), The Art and Science of Analyzing Software Data, Mor-
gan Kaufmann, 2015, pp. 285 – 323. doi:https://doi.org/10.1016/
B978-0-12-411519-4.00011-2.1240
[41] R. Hoda, J. Noble, Becoming agile: A grounded theory of agile transi-
tions in practice, in: Proceedings of the 39th International Conference on
Software Engineering, 2017, pp. 141–151. doi:10.1109/ICSE.2017.21.
[42] S. McIntosh, Y. Kamei, Are fix-inducing changes a moving target? a lon-
gitudinal case study of just-in-time defect prediction, IEEE Transactions1245
on Software Engineering 44 (5) (2018) 412–428. doi:10.1109/TSE.2017.
2693980.
52
[43] T. Zimmermann, N. Nagappan, A. Zeller, Predicting Bugs from His-
tory, Springer Berlin Heidelberg, 2008, pp. 69–88. doi:10.1007/
978-3-540-76440-3_4.1250
[44] Z. Li, S. Lu, S. Myagmar, Y. Zhou, Cp-miner: finding copy-paste and
related bugs in large-scale software code, IEEE Transactions on Software
Engineering 32 (3) (2006) 176–192. doi:10.1109/TSE.2006.28.
[45] F. Ebert, F. Castor, A study on developers’ perceptions about exception
handling bugs, in: 2013 IEEE International Conference on Software Main-1255
tenance, 2013, pp. 448–451. doi:10.1109/ICSM.2013.69.
[46] B. Johnson, Y. Song, E. Murphy-Hill, R. Bowdidge, Why don’t software
developers use static analysis tools to find bugs?, in: Proceedings of the
2013 International Conference on Software Engineering, 2013, pp. 672–681.
URL http://dl.acm.org/citation.cfm?id=2486788.24868771260
[47] F. Rahman, P. Devanbu, Ownership, experience and defects: A fine-grained
study of authorship, in: Proceedings of the 33rd International Confer-
ence on Software Engineering, 2011, pp. 491–500. doi:10.1145/1985793.
1985860.
[48] H. Zhang, L. Gong, S. Versteeg, Predicting bug-fixing time: An em-1265
pirical study of commercial software projects, in: 2013 35th Interna-
tional Conference on Software Engineering (ICSE), 2013, pp. 1042–1051.
doi:10.1109/ICSE.2013.6606654.
[49] G. Bavota, B. De Carluccio, A. De Lucia, M. Di Penta, R. Oliveto,
O. Strollo, When does a refactoring induce bugs? an empirical study,1270
in: Proceedings of the 2012 IEEE 12th International Working Confer-
ence on Source Code Analysis and Manipulation, 2012, pp. 104–113.
doi:10.1109/SCAM.2012.20.
[50] O. Meqdadi, N. Alhindawi, M. L. Collard, J. I. Maletic, Towards under-
standing large-scale adaptive changes from version histories, in: Proceed-1275
53
ings of the 2013 IEEE International Conference on Software Maintenance,
2013, pp. 416–419. doi:10.1109/ICSM.2013.61.
[51] M. Tufano, G. Bavota, D. Poshyvanyk, M. Di Penta, R. Oliveto, A. De Lu-
cia, An empirical study on developer-related factors characterizing fix-
inducing commits, Journal of Software: Evolution and Process 29 (1) (2017)1280
e1797, e1797 JSME-15-0185.R2. doi:10.1002/smr.1797.
[52] B. Ray, D. Posnett, P. Devanbu, V. Filkov, A large-scale study of program-
ming languages and code quality in github, Commun. ACM 60 (10) (2017)
91–100. doi:10.1145/3126905.
URL https://doi.org/10.1145/31269051285
[53] B. Ray, D. Posnett, P. Devanbu, V. Filkov, A large-scale study of program-
ming languages and code quality in github, Commun. ACM 60 (10) (2017)
91–100. doi:10.1145/3126905.
[54] A. E. Hassan, R. C. Holt, The top ten list: dynamic fault prediction, in:
21st IEEE International Conference on Software Maintenance (ICSM’05),1290
2005, pp. 263–272. doi:10.1109/ICSM.2005.91.
[55] J. A. Whittaker, J. Arbon, J. Carollo, How Google Tests Software, 1st
Edition, Addison-Wesley Professional, 2012.
[56] A. Mockus, Organizational volatility and its effects on software defects,
in: Proceedings of the 18th ACM SIGSOFT International Symposium on1295
Foundations of Software Engineering, 2010, pp. 117–126. doi:10.1145/
1882291.1882311.
[57] M. Mondal, C. K. Roy, K. A. Schneider, Identifying code clones having high
possibilities of containing bugs, in: Proceedings of the 25th International
Conference on Program Comprehension, 2017, pp. 99–109. doi:10.1109/1300
ICPC.2017.31.
[58] Ming Huo, J. Verner, Liming Zhu, M. A. Babar, Software quality and
agile methods, in: Proceedings of the 28th Annual International Computer
54
Software and Applications Conference, 2004. COMPSAC 2004., 2004, pp.
520–525 vol.1. doi:10.1109/CMPSAC.2004.1342889.1305
[59] A. T. Misirli, E. Shihab, Y. Kamei, Studying high impact fix-inducing
changes, Empirical Software Engineering 21 (2) (2016) 605–641. doi:10.
1007/s10664-015-9370-z.
[60] J. Oliveira, D. Borges, T. Silva, N. Cacho, F. Castor, Do android developers
neglect error handling? a maintenance-centric study on the relationship1310
between android abstractions and uncaught exceptions, Journal of Systems
and Software 136 (2018) 1 – 18. doi:https://doi.org/10.1016/j.jss.
2017.10.032.
[61] F. A. Fontana, R. Roveda, S. Vittori, A. Metelli, S. Saldarini, F. Mazzei,
On evaluating the impact of the refactoring of architectural problems on1315
software quality, in: Proceedings of the Scientific Workshop Proceedings of
XP2016, 2016, pp. 21:1–21:8. doi:10.1145/2962695.2962716.
[62] M. Linares-V´asquez, G. Bavota, C. Bernal-C´ardenas, M. Di Penta,
R. Oliveto, D. Poshyvanyk, Api change and fault proneness: A threat
to the success of android apps, in: Proceedings of the 2013 9th Joint1320
Meeting on Foundations of Software Engineering, 2013, pp. 477–487.
doi:10.1145/2491411.2491428.
[63] H. Hata, O. Mizuno, T. Kikuno, Bug prediction based on fine-grained mod-
ule histories, in: Proceedings of the 34th International Conference on Soft-
ware Engineering, 2012, pp. 200–210.1325
[64] C. Couto, P. Pires, M. T. Valente, R. S. Bigonha, N. Anquetil, Predicting
software defects with causality tests, Journal of Systems and Software 93
(2014) 24 – 41. doi:https://doi.org/10.1016/j.jss.2014.01.033.
[65] S. Kim, E. J. Whitehead, Jr., Y. Zhang, Classifying software changes: Clean
or buggy?, IEEE Transactions on Software Engineering 34 (2) (2008) 181–1330
196. doi:10.1109/TSE.2007.70773.
55
[66] Z. Wan, X. Xia, A. E. Hassan, D. Lo, J. Yin, X. Yang, Perceptions, expec-
tations, and challenges in defect prediction, IEEE Transactions on Software
Engineering 46 (11) (2020) 1241–1266. doi:10.1109/TSE.2018.2877678.
[67] P. S. Kochhar, D. Lo, J. Lawall, N. Nagappan, Code coverage and postre-
lease defects: A large-scale study on open source projects, IEEE Trans-
actions on Reliability 66 (4) (2017) 1213–1228. doi:10.1109/TR.2017.2727062.