Experienced Benefits of Continuous Integration in Industry Software Product Development: A Case Study

Daniel Ståhl
Ericsson AB
Datalinjen 3, 581 12 Linköping, Sweden
daniel.stahl@ericsson.com

Jan Bosch
Department of Computer Science and Engineering
Chalmers University of Technology
Göteborg, Sweden
jan.bosch@chalmers.se
ABSTRACT
In this paper, we present a multi-case study of industrial
experiences of continuous integration among software
professionals working in large scale development
projects. In literature, multiple benefits of continuous
integration are suggested, but case studies validating these
benefits are lacking. This study investigates the extent to
which continuous integration effects (increased
developer productivity, increased project predictability,
improved communication and enabling agile testing)
suggested in literature are experienced in industry
development projects. The study involves four
independent products at different levels of continuous
integration maturity within Ericsson AB. In each of these
products developers, testers, project managers and line
managers have been interviewed. Their experiences of
continuous integration are quantitatively assessed and
discussed in comparison to the continuous integration
benefits proposed in related work.
KEY WORDS
Agile Software Development, Software Methodologies,
Continuous Integration
1. Introduction
Continuous integration was popularized in the late '90s as
part of eXtreme Programming [1]. It is a software
development practice where changes are integrated early
and often. There is a wide spectrum of proposed
beneficial effects of this practice in related work, such as
enabling agile testing, communicating development status
within the team or increasing developer productivity,
while others suggest an increase in project predictability.
In other words, there isn't one homogeneous understanding
of what the exact consequences of introducing continuous
integration in software development are, and case studies
confirming claimed effects are lacking. Additionally,
there may be contextual differences between development
projects that impact the extent to which potential effects
of continuous integration manifest, which are not yet fully
understood.
This paper formulates, based on findings in related work,
a series of hypotheses of what the benefits of continuous
integration are. Then, through a quantitative case study of
industry development projects, the validity of each
hypothesis is examined.
The contribution of this paper is first that it uses an
industrial multi-case study of large-scale software
development to validate two hypotheses related to
continuous integration: it improves communication both
within and between teams and it improves project
predictability. Second, it questions another hypothesis:
continuous integration supports agile testing; the
empirical data of this study does not allow us to fully
validate this. Third, it is validated that continuous
integration increases developer productivity. In this case,
however, only one of the two reasons for this increase
suggested in related work is supported: while the effect of
continuous integration facilitating parallel development is
validated, the claim that it provides a significant reduction
of compilation and testing overhead prior to checking in
changes is not.
The remainder of this paper is organized as follows. In the
next section, the research method used for the study is
described. In section 3 the hypotheses based on claims
made by related work are formulated. The collection of
the case study data is discussed in section 4. In section 5
the data and the hypotheses are examined, and the paper is
then concluded in section 6.
2. Research Method
The research was conducted by reviewing existing articles
on continuous integration. Based on these articles
hypotheses as to the effects of adopting continuous
integration were formulated. Data was then gathered
through interviews with industry software professionals,
and the data was investigated in order to validate or
disprove the hypotheses.
2.1 Study of Related Work
The first steps of systematic literature review were used to
find articles that make explicit claims as to benefits of
continuous integration. This resulted in a set of seven
articles (see section 3), some of which were in agreement,
but several of which proposed continuous integration
effects not mentioned in the other articles. This made it
clear that there are differing expectations on continuous
integration as a practice, and indeed differing experiences
of it. To capture and investigate the benefits proposed in
the selected articles, hypotheses describing the effects of
continuous integration in software development projects
were then formulated (see section 3).
2.2 Interviews
The most appropriate method to examine the validity of
the formulated hypotheses, we concluded, was by
investigating the extent to which software professionals
recognize the proposed benefits in their work. In order to
compile a representative data set we decided to interview
professionals both in different development projects and
working in different disciplines. Therefore four products
were chosen (see section 4.1), and in each of those
products interviews with developers, testers, project
managers and line managers were conducted. The
interviews were semi-structured, with questions prepared
to cover effects proposed in related work, but encouraging
interviewees to discuss and elaborate on their answers. In
some cases additional questions were added to the
interview guide as a result of such discussions: in cases
where the interviewee didn't necessarily disagree with an
effect, but suggested other causes than those stated in the
hypothesis, we felt that it was important to capture this by
including it in subsequent interviews.
For each effect the interviewees were asked whether they
had experienced it from the continuous integration
implementation currently in place in their product
development. The interviewees were also asked to
quantify their answers by assigning them scores
representing the extent to which they perceived
continuous integration contributing to those effects. These
scores range from 0 (not at all), to 7 (very much).
In total, 22 individuals spread across the products
included in the study and representing the roles described
above were interviewed.
2.3 Investigation of Data
The answers provided by the interviewees were collated,
after which the average and standard deviation of the
scores were calculated for each continuous integration
effect as well as for the entire data set.
The average scores assigned by the interviewees were
used to determine whether their perceptions and
experiences support the formulated hypotheses, and the
standard deviations were used to determine to what extent
there was consensus among them.
Some interviewees refrained from answering certain
questions. This was due to one of two reasons: either the
question was not understood, or they did not consider
themselves to have the experience or be in a position
where they could tell whether a particular continuous
integration effect was manifest. Regardless of the reason,
such missing answers were omitted from the data set and
consequently not included in the calculation of score
averages and standard deviations.
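The calculation described above can be sketched as follows. This is a minimal illustration, not the authors' actual tooling: the function name `summarize` and the example scores are hypothetical, and since the paper does not state whether the sample or the population standard deviation was used, the sample form is assumed here.

```python
from statistics import mean, stdev


def summarize(scores):
    """Return (average, standard deviation, n) for one question.

    Missing answers are represented as None and are omitted before
    the statistics are calculated, as described in section 2.3.
    """
    answered = [s for s in scores if s is not None]
    # Assumption: sample standard deviation (n - 1 in the denominator).
    return mean(answered), stdev(answered), len(answered)


# Hypothetical answers on the 0-7 scale; None marks a refrained answer.
example_scores = [5, 3, None, 7, 0, 4, None, 6]
avg, sd, n = summarize(example_scores)
print(f"n={n} mean={avg:.2f} sd={sd:.2f}")
```

Dropping missing answers before the calculation means each question may be based on a different number of respondents, which is worth keeping in mind when comparing averages between questions.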
3. Hypotheses
To find articles based on which hypotheses concerning
the effects of continuous integration could be formulated
the first steps of systematic literature review were used.
The IEEE Xplore [2] database was searched for
publications with the terms “continuous”, “integration”
and “software” either in their titles or abstracts, which
returned 361 matches.
Reading the abstracts of these 361 papers, we determined
that a large number of them did not deal with the
software practice of continuous integration, but rather
with subjects such as signal processing and artificial
intelligence. Among those that did deal with the software
practice of continuous integration, a smaller set of 33
articles was found to potentially make concrete and
tangible claims as to benefits of continuous integration.
These articles were reviewed in full, and the hypotheses
below were formulated to represent the benefits of
continuous integration explicitly proposed in seven
of them [3-9].
Hypothesis 1: Continuous integration
supports the agile testing practices of
automated customer acceptance tests and
writing unit tests in conjunction with
production code. In [3], continuous integration
is considered essential in supporting agile
testing. Agile testing is in this context considered
to involve practices such as customer defined
acceptance tests, automating said tests and
running them in the regression test suite at least
daily, as well as developing unit tests for all new
code during a sprint (an iterative sprint based
process was used) and then running those unit
tests with every build.
Hypothesis 2: Continuous integration
contributes to improved communication both
within and between teams. The problems of
effectively communicating within a development
team are investigated in [4], which states that
continuous integration constituted a significant
part in how the team communicated and
consequently had a significant effect on the work
flow. However, while [4] investigated a single
team, we consider it to be equally important to
cover large scale development projects where
more than one team is involved. Thus the
hypothesis is phrased in such a way as to capture
both intra-team and inter-team communication.
Hypothesis 3: Continuous integration
contributes to increased developer
productivity as an effect of facilitating parallel
development in the same source context and
reduced compiling and testing locally before
checking in. It is reported from one project
switching to continuous integration [5] that it
allowed them to easily maintain more than one
line of development, thereby increasing the flow
of changes. Additionally, [6] calculates the net
gains of adopting continuous integration by
measuring time saved by the developers not
compiling and testing locally before checking in
versus the time spent on the continuous
integration framework itself. This implies that
the primary benefit of continuous integration is
as a time saver for developers. The third
hypothesis was phrased to cover both of these
proposed benefits.
Hypothesis 4: Continuous integration
improves project predictability as an effect of
finding problems earlier. [7] claims that with
continuous integration, the company studied in
the article is now able to release frequently and
predictably. [8] finds that continuous integration
testing is an effective method of discovering
bugs continuously, and [9] supports this view,
claiming that continuous integration is helpful in
finding problems earlier rather than later.
4. Data Collection
This section describes the data collection process.
4.1 Studied Products
This section describes the four products included in the
case study. All of these are Ericsson products, due to the
ease with which we were able to obtain access to them,
but they are all developed by independent organizations
within the company. From the many dozens of
development projects in Ericsson these four were selected
to ensure a good distribution and representativeness of
software development projects at large. We wanted to
capture projects with very short experience of continuous
integration, as well as those with relatively longer
experience. We also wanted to represent products where
the continuous integration was explicitly broken into
stages, with later stages focused on the integration of pre-
built binary components, similar to the approach
described in [10], as well as products where this is not the
case. In other words, we wanted to include one product
representing each quadrant of the diagram shown in figure
1.
It should be noted that all of the studied products were, at
the time of the study, being actively developed by
multiple teams.
4.1.1 Product A
Product A is a network node, developed by tens of
cross-functional development teams. Each team delivers
into the product mainline every few months, on average.
Taken across all teams, new changes are consequently
integrated into the product mainline, and the product is
rebuilt and retested, several times a week. At the time of the case
study, the development organization of product A was in
the planning stage of their continuous integration
implementation and was chosen to represent products
with short experience and lesser focus on binary
components (see figure 1).
4.1.2 Product B
Product B is actually a portfolio of products that interacts
with a wide array of network nodes. Unlike the other
products in the case study, which are considering or have
adopted continuous integration at a mature stage in the
product life cycle, product B was designed for large scale
continuous integration from the outset, using a binary
integration based approach similar to that described by
[10]. At the time of the case study the portfolio was still in
its first year of development. This product was chosen to
represent products integrating binary components, but
with short experience of continuous integration (see figure
1). Measured in number of teams, product B was at the
time of writing the smallest product in the case study,
with fewer than ten cross-functional development teams.
4.1.3 Product C
The software delivered by product C is developed by tens
of cross-functional teams and integrated into a large
number of hardware and software configurations, placing
strict requirements on variability and extensive
verification. Each team pushes into the product mainline
approximately once a week, on average, using a time slot
reservation system. In this regard, the continuous
integration of product C can be described as a quicker
variant of the process previously in place in product A.
This illustrates the elasticity of how the term continuous
integration is sometimes used in large scale software
development.
Figure 1: The four quadrants of software development
projects represented in the case study.
Having a similar approach as product A to building the
product, but longer experience, product C was chosen to
represent the upper left quadrant of figure 1.
4.1.4 Product D
Product D is a network node, developed by tens of cross-
functional development teams. Several years prior to the
case study the development organization switched from a
previous integration approach, with late integration of
large changes, to continuous integration. Their continuous
integration setup compiles changed modules in the first
stage and then, via binary integration, builds and verifies
the full node.
For these reasons, product D was chosen to represent
product development with a longer experience of
continuous integration as well as a greater focus on binary
components.
4.2 Interview Guide
The interview guide consisted of eleven questions,
designed to address the four hypotheses (see section 3).
4.2.1 Supporting Agile Testing
The first hypothesis states that continuous integration
supports the agile testing practices of automated customer
acceptance tests and writing unit tests in conjunction with
production code. To examine this hypothesis, two
questions were included in the interview guide:
1. To what extent have you experienced continuous
integration supporting agile testing, in the sense
of automated customer acceptance tests?
2. To what extent have you experienced continuous
integration supporting agile testing, in the sense
of writing unit tests in conjunction with new
production code?
4.2.2 Improving Communication
The second hypothesis states that continuous integration
contributes to improved communication both within and
between teams. This hypothesis was addressed by the
following questions:
3. To what extent have you experienced continuous
integration contributing to improved intra-team
communication?
4. To what extent have you experienced continuous
integration contributing to improved inter-team
communication?
4.2.3 Increasing Developer Productivity
The following questions were designed to address the
hypothesis that continuous integration contributes to
increased developer productivity as an effect of
facilitating parallel development in the same source
context and reduced compiling and testing locally before
checking in:
5. To what extent have you experienced continuous
integration improving developer productivity, as
an effect of less local compiling and testing
before checking in?
6. To what extent have you experienced continuous
integration facilitating parallel development in
the same source context?
In addition to these questions, it was suggested during the
study that there may be other causes of potentially
increased developer productivity, and so the following
questions were also included in the interview guide:
7. To what extent have you experienced continuous
integration improving developer productivity as
an effect of easier re-basing and merging?
8. To what extent have you experienced continuous
integration improving developer productivity as
an effect of more effective troubleshooting?
4.2.4 Improving Project Predictability
The final hypothesis is that continuous integration
improves project predictability as an effect of finding
problems earlier. The following question was included to
examine this:
9. To what extent have you experienced continuous
integration improving project predictability, as
an effect of finding problems early?
While related work is largely focused on describing unit
tests and (functional) acceptance tests in relation to
continuous integration, we were also curious about
whether early non-functional system testing is improving
predictability in the industry:
10. To what extent have you experienced continuous
integration improving project predictability, as
an effect of early non-functional system testing?
Furthermore, it was suggested during the study that
predictability also increases because integration is
performed outside of the project's critical path. The
reasoning here is that small, early and incremental
integrations can be done in parallel with development,
while traditional “big bang” integrations towards the end
of the project inevitably take place on the critical path.
We considered it to be worth dedicating an extra question
to this:
11. To what extent have you experienced continuous
integration improving project predictability, as
an effect of integration taking place outside of
the critical path?
4.3 Interviewees
To ensure a sufficient distribution of interviewees in each
of the projects developing the studied products, managers
of these products were asked to suggest interviewees
representing developers, testers, project managers and line
managers. One to two representatives of each role in each
product were interviewed.
5. Hypotheses and Data Examination
This section examines the data gathered in the study and
discusses it in relation to the formulated hypotheses.
5.1 Examination of Data
A total of eleven questions pertaining to the formulated
hypotheses are included in the data set. Across all eleven
questions, the average standard deviation of scores
assigned by interviewees (on a scale of 0 to 7, see section
2.2) to their experiences of continuous integration effects
was 2.30. We believe that this high standard deviation
reveals a large amount of disagreement. Even though the
exact causes of this disagreement are currently not
understood, we nevertheless find this to be an interesting
result in itself: it supports the view that perceptions and
experiences of continuous integration differ (see section
2.1).
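The figure of 2.30 can be cross-checked against the per-question standard deviations reported in section 5.2. The sketch below simply averages those eleven reported values; the variable names are illustrative, and it assumes the overall figure is the plain arithmetic mean of the per-question values.

```python
# Standard deviations reported per question in section 5.2 of the paper.
reported_sd = {
    1: 1.90, 2: 2.33, 3: 2.83, 4: 2.44,
    5: 2.79, 6: 2.02, 7: 2.95, 8: 2.18,
    9: 1.53, 10: 1.98, 11: 2.36,
}

# Assumption: the "average standard deviation" is the unweighted mean
# of the eleven per-question standard deviations.
average_sd = sum(reported_sd.values()) / len(reported_sd)
print(f"{average_sd:.2f}")  # prints 2.30
```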
In order to determine whether a hypothesis was validated
by the empirical data, the average score for the questions
pertaining to that hypothesis was used: an average of 3.5
or above is considered a validation. For hypotheses with
multiple stipulated causes, all causes must be validated for
the hypothesis itself to be fully validated. Also note that
some questions (e.g. questions 10 and 11) are not
designed to validate any hypotheses, but merely to
provide additional data. Furthermore, we do not consider
a score below 3.5 to necessarily rule out the validity of an
effect; it may instead be an indication of a correlation
that is not yet fully understood.
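The validation rule can be expressed as a short sketch. The average scores are those reported in section 5.2; the function name `verdict`, the dictionary layout and the three verdict labels are illustrative choices rather than the authors' terminology, and questions 7, 8, 10 and 11 are treated here as additional data that do not enter the rule.

```python
THRESHOLD = 3.5  # validation threshold on the 0-7 scale

# Average scores reported in section 5.2, keyed by question number and
# grouped by the stipulated causes of each hypothesis.
averages_by_hypothesis = {
    "H1 agile testing": {1: 2.82, 2: 3.77},
    "H2 communication": {3: 3.77, 4: 3.85},
    "H3 developer productivity": {5: 2.08, 6: 4.91},
    "H4 project predictability": {9: 4.77},
}


def verdict(averages, threshold=THRESHOLD):
    """Classify a hypothesis from the average scores of its causes.

    All causes must reach the threshold for full validation.
    """
    supported = [avg >= threshold for avg in averages.values()]
    if all(supported):
        return "validated"
    if any(supported):
        return "partly supported"
    return "not supported"


for name, averages in averages_by_hypothesis.items():
    print(f"{name}: {verdict(averages)}")
```

Under these assumptions the rule marks hypotheses 2 and 4 as validated and hypotheses 1 and 3 as partly supported, in line with the discussion in sections 5.2 and 5.3.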
5.2 Examination of Hypotheses
This section discusses each hypothesis in turn and
presents the results of the questions pertaining to those
hypotheses. The scores given by the interviewees in
response to these questions are depicted in figures 2
through 5. In each figure, the average score is displayed
as a horizontal bar. Also, the distribution of responses is
represented as a candlestick chart [11], with the vertical
bar representing the minimum and maximum scores,
while the lower and upper boundaries of the box represent
the first and third quartiles respectively.
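The values behind one such candlestick can be derived as in the sketch below. The example scores are hypothetical, the function name `candlestick_summary` is illustrative, and the quartile interpolation method is an assumption, since the paper does not specify how quartiles were computed.

```python
from statistics import mean, quantiles


def candlestick_summary(scores):
    """Return the values drawn for one question in figures 2 through 5."""
    # Assumption: quartiles use the "inclusive" interpolation method.
    q1, _median, q3 = quantiles(scores, n=4, method="inclusive")
    return {
        "min": min(scores),    # lower end of the vertical bar
        "q1": q1,              # lower boundary of the box
        "mean": mean(scores),  # horizontal bar
        "q3": q3,              # upper boundary of the box
        "max": max(scores),    # upper end of the vertical bar
    }


# Hypothetical scores on the 0-7 scale for a single question.
example_scores = [0, 2, 3, 4, 5, 5, 6, 7]
print(candlestick_summary(example_scores))
```

A wide box with the mean near its middle, as opposed to a narrow box, is what the text later refers to as polarized versus consensual responses.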
5.2.1 Supporting Agile Testing
The first hypothesis is that continuous integration
supports the agile testing practices of automated customer
acceptance tests and writing unit tests in conjunction with
production code, which is addressed by questions 1 and 2
(see section 4.2.1). The scores are displayed in figure 2.
Question 1 (support of agile testing in the sense of
automated customer acceptance tests) received an average
score of 2.82, with a standard deviation of 1.90.
Question 2 (support of agile testing in the sense of unit
tests written in conjunction with production code)
received an average score of 3.77, with a standard
deviation of 2.33.
The support for the hypothesis from this result is
ambiguous: while the interviewees perceive an effect on
supporting unit tests written in conjunction with
production code, their experience of the customer
acceptance test effect is significantly weaker.
Some of the interviewees suggested that this is because
they don't have any direct interaction with end customers
(rather, their “customers” tend to be another department
or internal testers) and that they didn't make use of this
particular testing process.
Even though scores of question 2 are significantly higher,
it was argued several times during the study that this
should not be considered so much an effect of continuous
integration, as a prerequisite.
In conclusion, we find that our study cannot fully validate
this hypothesis, although it is partly supported. It does
appear clear, however, that there is a correlation between
agile testing practices and continuous integration (and a
point could arguably be made that continuous integration
itself is one such practice), but the exact nature of this
correlation is not fully understood. In particular, it
remains unclear what is cause and what is effect, and what
the contextual prerequisites of successful interaction
between continuous integration and customer acceptance
tests are. It is also unclear to what extent the particular
circumstances of the studied products affect the ability of
these benefits to manifest: as hinted at by the interviewees
themselves, it is possible that in a different context the
support for automated customer acceptance tests would be
more pronounced.
Figure 2: Average scores assigned by interviewees in
response to questions 1 and 2 (see section 4.2.1). The
lower and upper bounds of the boxes represent the first
and third quartiles, respectively.
5.2.2 Improving Communication
The second hypothesis states that continuous integration
contributes to improved communication both within and
between teams, which is addressed by questions 3 and 4
(see section 4.2.2). The scores are displayed in figure 3.
Question 3 (improving intra-team communication)
received an average score of 3.77, with a standard
deviation of 2.83.
Question 4 (improving inter-team communication)
received an average score of 3.85, with a standard
deviation of 2.44.
The data gathered in the case study validates the
hypothesis, both on account of intra-team and inter-team
communication. It should be noted, however, that the
average scores for these questions are not exceedingly
high, while the standard deviation is above average. But
this does not tell the full story: looking at the distribution
in figure 3 it becomes evident that experiences,
particularly in response to question 3, are very polarized.
Indeed, elaborating on these questions, interviewees gave
diverging accounts of how product build and quality
status was communicated in their respective products.
Therefore we believe that further investigation into how
differences in continuous integration implementations
affect these potential benefits is warranted.
5.2.3 Increasing Developer Productivity
The third hypothesis is that continuous integration
contributes to increased developer productivity as an
effect of facilitating parallel development in the same
source context and less compiling and testing locally
before checking in, which is addressed by questions 5 and
6. In addition, questions 7 and 8 address the same effect,
but suggest different causes (see section 4.2.3). The scores
are displayed in figure 4.
Question 5 (increased productivity as an effect of less
local compiling and testing before checking in) received
an average score of 2.08, with a standard deviation of
2.79.
Question 6 (increased productivity as an effect of
facilitating parallel development) received an average
score of 4.91, with a standard deviation of 2.02.
Question 7 (increased productivity as an effect of easier
re-basing and merging) received an average score of 3.46,
with a standard deviation of 2.95.
Question 8 (increased productivity as an effect of more
effective troubleshooting) received an average score of
2.92, with a standard deviation of 2.18.
There are striking differences in how interviewees rate the
contribution to developer productivity, depending on
which cause the question addresses. Increased
productivity as an effect of easier re-basing and more
effective troubleshooting receive moderate support, while
the time saving aspect (question 5) received very low
scores. All of these effects also had medium to high
standard deviation, indicating differing opinions. Figure 4
clearly shows that questions 5 and 7, especially, received
polarized responses which warrant further investigation.
In contrast, in the experience of the interviewed software
professionals in this study, there is a significant effect in
facilitating parallel development in the same source
context (question 6). Here the standard deviation is also
relatively low, with first and third quartiles close together,
indicating a higher degree of consensus. It is worth
noting, however, that even so there are individuals who do
not perceive this effect at all.
In conclusion, we find that the case study partly supports
the hypothesis, while also indicating that there may be
other (albeit lesser) causes for increased productivity,
such as more effective troubleshooting and easier re-
basing and merging.
It should be noted that in addition to the low scores
assigned to the time saving effect suggested by [6],
one interviewee went as far as to claim that in fact the
exact opposite is true: as a consequence of adopting
continuous integration, developers become more careful
in verifying their changes before checking them in, as it is
considered of utmost importance to not break the build
needlessly. Indeed, this point of view also has support in
literature [12]. However, this does not necessarily rule out
the validity of the effect itself: there may be differences in
context that determine to what extent certain benefits can
manifest. For instance, all the development projects of
this study are significantly larger than that studied in [6],
containing many times the number of development teams.
It is conceivable that such environmental factors can
influence whether it's beneficial for developers to skip
local compiling and testing before checking in. The high
standard deviation can be interpreted as supporting this
assumption (there is simply significant disagreement
among the interviewees as to whether this is a benefit of
continuous integration or not), but currently it is not
understood what those factors are or what their influence
is.
Figure 3: Average scores assigned by interviewees in
response to questions 3 and 4 (see section 4.2.2). The
lower and upper bounds of the boxes represent the first
and third quartiles, respectively.
Figure 4: Average scores assigned by interviewees in
response to questions 5, 6, 7 and 8 (see section 4.2.3).
The lower and upper bounds of the boxes represent the
first and third quartiles, respectively.
5.2.4 Improving Project Predictability
The fourth hypothesis is that continuous integration
improves project predictability as an effect of finding
problems earlier, which is addressed by question 9.
Additionally, questions 10 and 11 are also about improved
project predictability, but suggest different reasons for it
(see section 4.2.4). The scores are displayed in figure 5.
Question 9 (improving predictability as an effect of
finding problems early) received an average score of 4.77,
with a standard deviation of 1.53.
Question 10 (improving predictability as an effect of early
non-functional system testing) received an average score
of 3.42, with a standard deviation of 1.98.
Question 11 (improving predictability as an effect of
performing integration outside of the critical path)
received an average score of 3.67, with a standard
deviation of 2.36.
The gathered data clearly supports the hypothesis that
continuous integration improves project predictability as
an effect of finding problems early. Not only did this
effect receive the second highest average score in the data
set: it also received the lowest standard deviation and was
the only effect that every individual participating in the
study perceived to some extent. This indicates a strong
consensus among the interviewees.
In addition to this, the data suggests that predictability
may also be improved by early non-functional system
testing and integrating outside of the project's critical
path, although to a lesser degree. It deserves to be pointed
out that even though the average scores of questions 10
and 11 were not among the highest in the data set, they do
not show the same polarization in responses as questions
pertaining to communication and productivity (see
sections 5.2.2 and 5.2.3).
5.3 Validation of Hypotheses
Based on the data presented above, we find that the first
hypothesis (continuous integration supports the agile
testing practices of automated customer acceptance tests
and writing unit tests in conjunction with production
code) cannot be validated by this case study, although
interesting questions remain to be answered.
Furthermore, we find that the second hypothesis
(continuous integration contributes to improved
communication both within and between teams) is
validated, with the caveat that experiences are very
disparate.
In addition, we find that the fourth hypothesis (continuous
integration improves project predictability as an effect of
finding problems earlier) is validated.
Finally, the third hypothesis (continuous integration
contributes to increased developer productivity as an
effect of facilitating parallel development in the same
source context and less compiling and testing locally
before checking in) is partly validated: we found support
for the first of the stipulated causes, but not the latter.
6. Conclusion
It is our conclusion that continuous integration offers not
one but several benefits. We also find that each
of our hypotheses (see section 3) is, at least partly,
supported by the collected data, even in cases where they
cannot be unambiguously validated.
It is shown that there is a relationship between continuous
integration and the agile testing methods of automated
customer acceptance tests and writing unit tests in
conjunction with new production code. The data does,
however, raise unanswered questions as to whether
continuous integration supports the unit test practice,
whether it is the other way around, or whether they support
each other. It
is also difficult to isolate the effect of continuous
integration on the practice of automated customer
acceptance testing from contextual factors (such as
organizational structure, culture and customer
availability) in our data.
Figure 5: Average scores assigned by interviewees in
response to questions 9, 10 and 11 (see section 4.2.4).
The lower and upper bounds of the boxes represent the
first and third quartiles, respectively.
Furthermore, the data shows, with the caveat that
interviewees report disparate experiences, that continuous
integration is generally perceived as having a positive
effect on communication not just within the team, as
suggested by [4], but in larger projects also between
teams.
There is also strong support for the hypothesis that
continuous integration improves developer productivity,
but only for one of the reasons originally stipulated: while
there is a strong consensus in the study that it facilitates
parallel development, there is very weak support for the
time-saving aspect proposed by [6]. It should be noted,
however, that in this case there is an exceptional lack of
consensus: a number of interviewees do perceive this
effect very strongly. This leaves us with an unanswered
question: why do the experiences of this proposed
effect differ? Are there factors that determine whether it
is a benefit or not to save time by not compiling and testing
before checking in?
In addition to the causes for improved productivity
mentioned in the original hypothesis, we also found
support for productivity increases due to more effective
troubleshooting and easier re-basing and merging,
although to a lesser extent than the parallel development
effect.
We also find that our case study strongly supports the
hypothesis that continuous integration improves project
predictability by finding problems earlier. In addition to
this, the data suggests that predictability is further
improved because non-functional system testing can be
performed early and integration can be performed outside
of the project's critical path.
Additionally, we find no reason to believe that the effects
discussed in this article constitute an exhaustive list of
continuous integration benefits. It should also be noted
that potentially negative effects of continuous integration
have not been discussed, which we believe to be an
important topic of further research.
Finally, we observe that, for most effects, the
standard deviation of responses is very high, indicating
that experiences of continuous integration differ. Viewed
in light of the wide spectrum of continuous integration
benefits described in literature, this is not
altogether surprising. We nevertheless believe that this is
an important area that deserves further investigation: what
are the causes of these differences in experienced
continuous integration effects? Are they due to individual
perception, differences in culture and process between the
development projects, or inherent differences in the
products being developed? Or could it be that the concept
of continuous integration has been interpreted and
implemented in different ways? At the time of writing, this
is not clear. It is conceivable, however, that a better
understanding of potential differences in continuous
integration implementations and their effects could help
software development projects to shape their continuous
integration in such a way as to optimize for the benefits
they seek to achieve.
Acknowledgements
We would like to extend our sincere thanks to all the
interviewees who shared their valuable time and
experiences with us. We also want to thank Ericsson AB
for allowing us to study their product development.
References
[1] K. Beck, Extreme Programming Explained (Addison-
Wesley Professional, 2000).
[2] IEEE Xplore, http://ieeexplore.ieee.org
[3] S. Stolberg, Enabling Agile Testing Through
Continuous Integration, Agile 2009 Conference, Chicago,
IL, 2009.
[4] J. Downs, J. Hoskins, B. Plimmer, Status
Communication in Agile Software Teams: A Case Study,
Fifth International Conference on Software Engineering
Advances, Nice, France, 2010.
[5] F. J. Lacoste, Killing the Gatekeeper: Introducing a
Continuous Integration System, Agile 2009 Conference,
Chicago, IL, 2009.
[6] A. Miller, A Hundred Days of Continuous
Integration, Agile 2008 Conference, Toronto, Canada,
2008.
[7] D. Goodman, M. Elbaz, "It's Not the Pants, it's the
People in the Pants" Learnings from the Gap Agile
Transformation What Worked, How We Did it, and What
Still Puzzles Us, Agile 2008 Conference, Toronto,
Canada, 2008.
[8] H. Liu, Z. Li, J. Zhu, H. Tan, H. Huang, A Unified
Test Framework for Continuous Integration Testing of
SOA solutions, 2009 IEEE International Conference on
Web Services, Los Angeles, CA, 2009.
[9] B. Boehm, R. Turner, Management Challenges to
Implementing Agile Processes in Traditional
Development Organizations, IEEE Software, 22(5), 2005,
30-39.
[10] M. Roberts, Enterprise Continuous Integration Using
Binary Dependencies, 5th International Conference on
Extreme Programming and Agile Processes in Software
Engineering, Garmisch-Partenkirchen, Germany, 2004.
[11] G. Morris, Candlestick Charting Explained
(McGraw-Hill Professional, 2006).
[12] J. Humble, D. Farley, Continuous Delivery
(Addison-Wesley, 2011).