ResearchOps: The Case for DevOps in Scientific
Applications
Maximilien de Bayser
IBM Research
Av. Pasteur 146 & 138, Botafogo,
Rio de Janeiro, RJ, Brazil – 2290-240
Email: mbayser@br.ibm.com
Leonardo G. Azevedo
IBM Research
Av. Pasteur 146 & 138, Botafogo,
Rio de Janeiro, RJ, Brazil – 2290-240
Email: lga@br.ibm.com
Renato Cerqueira
IBM Research
Av. Pasteur 146 & 138, Botafogo,
Rio de Janeiro, RJ, Brazil – 2290-240
Email: rcerq@br.ibm.com
Abstract—DevOps (a portmanteau of “development” and
“operations”) is a software development method that extends
the agile philosophy to rapidly produce software products and
services and to improve operations performance and quality
assurance. It was born to accelerate the delivery of web-based
systems and quickly bring new value to users. Many web-based
systems evolve according to usage trends without a clear long-
term goal. Before the widespread use of web services, most software with a clear goal was delivered as packages that users installed on their own systems. New versions were delivered with a
much lower frequency, with periods in between versions ranging
from months to years. Development cycles were divided into large
design, coding and testing phases culminating in the release of
a new stable version. In software development in the context
of applied science, even when the goal is clear, the process to
attain it is not. Hence, working versions that capture the current software state must be released frequently in order to reduce the
risks for all stakeholders and to make it possible to assess the
current state of a project and steer it in the right direction. This
paper explores the usefulness of DevOps concepts to improve
the development of software that supports scientific projects.
We establish the similarities and differences between scientific
projects and web applications development, and discuss where the
related methodologies need to be extended. Unique challenges are discussed, along with the solutions we developed and the questions that remain open.
Lessons learned are highlighted as best practices to be followed
in research projects. This discussion is rooted in our experience
in real-life projects at the IBM Research Brazil Lab, which apply just as well to other research institutions.
I. INTRODUCTION
Today the use of computing in science is pervasive. In
chemistry, physics, biology and many other areas, simulators
are used to conduct experiments in silico. Simulations are
much cheaper and faster to execute than conducting ex-
periments on real molecules or bacteria. In many projects,
computing is not only helpful but also necessary when the
amount of information that has to be processed is too large
for a human to handle in a reasonable amount of time. In
other projects, the software is the result of the research. For
instance, the Debater¹ is a project whose goal is to research
how to build systems that can argue on a human level.
The IBM Research - Brazil lab works on many research
fronts including fundamental physics, geomechanics, optimiza-
tion theory, machine learning, software engineering and many
1http://www.kurzweilai.net/introducing-a-new-feature-of-ibms-watson-the-
debater
others. Software artifacts play an essential role in our work and
most of the code that we develop involves highly specialized
knowledge that cannot be easily delegated to programmers.
This means that the roles of scientist, developer, tester and operator are somewhat condensed. In this scenario, the fundamental role of the lab's research software engineers and computer science PhDs is to apply our expertise to structure development and deployment. As more projects are completed, we need to identify reusable assets to streamline the
development of future projects. With a common infrastructure
and reusable modules, new projects can be handled in an
incremental way.
The software as the goal of the project, or as a tool to
attain theoretical results, captures essential knowledge about a
project. The mathematical models are explicitly represented as
programming language code. Assumptions about the input data
must be verified by the software. Unit tests contain expected
results for pre-defined input data. And, not least importantly,
the version control system tracks the evolution of knowledge
about the project’s subject.
The development of applied science projects is not a
random search of potential solutions. The problem space is
carefully mapped into known and unknown areas and inter-
mediate goals are established. For instance, in the Debater project, there is a set of essential steps before searching for evidence to support and build arguments. Natural language processing algorithms parse the sentence or question, which includes, e.g., tokenization, part-of-speech tagging, and entity detection. An intermediate goal would be topic
detection, where researchers study existing knowledge (in this
case, related to topic detection), investigate solutions already
proposed, eventually propose a solution that advances the state
of the art, and, finally, develop a software prototype that will
be a component of the final software. This component captures
the existing knowledge about topic detection, and the regression tests developed for it ensure it continues to be a sound base for the rest of the project.
The formal language of the code has a strong relationship
to the state of the theory behind the project and, therefore,
the application of agile development practices must consider
the development of the scientific knowledge. Agile meth-
ods value collaboration, response time to changes, business
understanding, simplicity, and agility [1]. We believe that
the exploratory development and collaboration that are made
possible by fast prototyping are valuable to scientific projects
where new hypotheses must be tested and results reviewed by all participants.
The software is also where theoretical development must
meet pragmatic considerations such as structuring into
reusable components, the choice of execution environment, and
the form of delivery to the stakeholders. As we will try to
argue in this paper, it is important to continuously deliver new
progress and, in this, the methodologies and tools of DevOps
are valuable.
The goal of this work is to explore the usefulness of
DevOps concepts to improve the development of software
that supports scientific projects. The paper discusses some
standard techniques, such as agile development, testing, system
configuration, version control and continuous integration, in
the context of scientific software development. We present the
relevance of DevOps for scientific applications, thereby raising
various issues regarding software development and system
management for scientific applications in real-life projects. We
outline the current gap and the (human and technical) interfaces required between scientific researchers (in different fields, not only computer science) and software engineering departments.
Our methodology was as follows. First, we present
the background in DevOps. We establish the similarities and
differences between scientific projects and web applications
development. Afterwards, we present the unique characteristics
of research projects at IBM Research – Brazil Lab that foster
the use of agile methods. Then, we discuss where the related
methodologies need to be extended. We present challenges and
solutions, as well as questions that remain open. Finally, we present best
practices to be followed in research projects.
We base our discussion on our experience with applied
research and DevOps principles at the IBM Research - Brazil
Lab. As many of the ongoing projects are confidential, we
will use examples of public projects that were also developed
at other IBM Labs.
The remainder of this work is divided as follows. Section II
presents the main DevOps concepts (e.g., Infrastructure as Code - IaC), the motivations that foster its use, how the concept evolved,
examples of tools, and project experience. Section III presents
some characteristics about how agile applied research is con-
ducted in our lab. Section IV presents existing challenges,
solutions to them, and questions that remain open in adopting DevOps. Section V closes this report with a few conclusions
and recommendations.
II. DEVOPS
The original meaning of DevOps was a set of principles
that advocated the closer integration of developers and opera-
tors to reduce the friction that arose with new software releases.
Traditionally, the development of software was done in long
development cycles and when the software was deemed stable
enough in the development environment, it was handed over
to the operations team. However, the production environment
could be very different, maybe even with a different operating
system, triggering bugs and delaying the deployment. Like
many terms, the meaning of DevOps has changed slightly
when it gained broader adoption and became a buzzword [2].
In 2008, Debois [3] discussed his experience with the development and deployment of three different projects and
his attempt to promote the use of Agile principles for infras-
tructure. He reported developers had little awareness of the
impact of their application on the deployment infrastructure,
which was shared among different applications. Infrastruc-
ture and operations were seen as completely orthogonal to
development. Conversely, since the operations team was not
aware of the planning and priorities of projects, they would
focus on changes in infrastructure that would often negatively
affect running applications. Besides, given that developers
are rewarded for new features and operations for stability,
developers and operations would see each other as negatively
affecting their work, leading to resentments.
In fact, Debois and others realized that operations teams
must participate in the planning of projects. This makes them
aware of the planned functionalities and their requirements on
the infrastructure and, at the same time, they can explain to
developers what the limitations of the shared infrastructure are.
It also makes it possible to plan the entire life-cycle of new features, from development to deployment. These ideas were further
refined and grouped with others under the name of DevOps,
a portmanteau of Development and Operations.
DevOps tries to improve the integrated planning and ex-
ecution of delivering new value with several guidelines: (i)
Make developers production-aware and vice-versa; (ii) Have developers and operators adopt the same toolset; (iii) Shorten the release cycles so that the impact of releases is more
manageable; and, (iv) Promote tools to automate the setup of
environments to make them reproducible. DevOps aims at a
continuous pipeline of delivery where new features are auto-
matically tested in the correct environment and then approved
for production.
Although DevOps originally focused more on the culture
of development and operations, the idea of applying agile
principles to infrastructure has led many people to explore
what other elements of developer culture could be applied to
operations. Over the years, programmers have accumulated a
significant body of knowledge about what works to keep track
of changes and reuse code. Among those are automation of
tasks, domain specific languages (DSL), plain text files and
version control systems [4]. These principles were applied to
infrastructure and resulted in tools such as Chef² [5], Puppet³ [6], SmartFrog⁴, and Vagrant⁵. Chef, for instance, allows one
to write infrastructure setup scripts using a declarative DSL
that is a subset of Ruby. These scripts describe what software
packages have to be installed on what nodes of an infrastruc-
ture and how they have to be configured. When combined
with virtual machine setup scripts, these infrastructure scripts
enable a user to create an entire virtual environment with a
single command.
This has led to the concept of Infrastructure as Code (IaC)
which mandates no manual configuration [7], since manual configuration is difficult to reproduce and does not scale. Instead, one should code the
entire infrastructure setup with IaC tools and put this code
2http://www.getchef.com/
3http://puppetlabs.com/
4http://www.smartfrog.org/
5http://www.vagrantup.com/
under version control. It results in a one-to-one mapping of
environment configuration to version control revisions. The
benefits are: (i) It is easier to revert to a previous configuration
if the changes introduce instability; (ii) Testing is improved;
(iii) Communication between development and operations teams improves. Developers can test their code in a realistic environment, leveraged by virtualization. Operators can include the revision identifier of the environment where a bug is observed and help developers reproduce the exact conditions. DevOps tools became so popular that it is now common to see the term DevOps used to refer only to the tools, a subset of the original concept that comprises culture, practices and tools.
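To make this declarative style concrete, the sketch below (written in Python rather than Chef's Ruby DSL, and not taken from any of our projects) shows the idea that IaC tools encourage: a version-controlled description of what each node role needs, and a small routine that converges a machine toward that state. The role names, package lists and use of rpm/yum are illustrative assumptions.

    import subprocess

    # Illustrative, version-controlled description of what each node role needs.
    # In a real setup this would live in a Chef cookbook or similar IaC tool.
    ROLES = {
        "simulation-node": ["gcc-c++", "openmpi", "petsc-devel"],
        "web-frontend": ["httpd"],
    }

    def installed(package):
        # Check whether an RPM package is already present (RHEL/CentOS style).
        return subprocess.call(["rpm", "-q", package],
                               stdout=subprocess.DEVNULL,
                               stderr=subprocess.DEVNULL) == 0

    def converge(role):
        # Install only what is missing, so running the script twice is harmless.
        for package in ROLES[role]:
            if not installed(package):
                subprocess.check_call(["yum", "install", "-y", package])

    if __name__ == "__main__":
        converge("simulation-node")

Because such descriptions are plain text under version control, each repository revision corresponds to a reproducible environment configuration.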
Another notable trend that led to a broader adoption of
DevOps is the increase in software that is offered as a service
over the web. When software was delivered to clients as
packages of executables and data files to be installed on the
users’ computer, it made sense to minimize the number of releases and to group as many improvements as possible in a
single release. In web development, however, the deployment
of software is entirely under control of the organization that
develops it, because the servers where the application runs
are owned or rented. In the competitive environment of web
applications, it is critical to deploy new features as fast as
possible to production. To move a developer’s code change
committed to version control to production in a matter of days
or hours requires a well tuned collaboration of development
and operations.
Feitelson et al. [8] discuss their experience at Facebook
describing the engineering of a sophisticated pipeline of de-
livery of code to production that involves modern version
control, code reviews and deployment to a subset of servers
before full deployment. Most importantly, their approach is
fundamentally based on creating the right development culture
more than on tools. An important characteristic of the nature
of Facebook is that the product is not specified in advance and
is continuously developed in response to usage trends, which
is another reason not to use a Waterfall planning but rather an
incremental planning model. Another highlight of their report
is the absence of a quality assurance (QA) team. They believe
that each developer should take personal responsibility for their
code and write the test cases. Since releases are incremental,
it is easier to test a feature with real users and revert to a
previous version if necessary.
The tighter integration of developers and operations, some-
what blurring the difference between those two roles, has led
to many misconceptions. Some critics say that DevOps only
applies to start-ups that, due to resource limitations, have no choice but to make their developers take on the addi-
tional roles of system and database administrators, in addition
to testers, turning them into so-called full-stack developers.
Another criticism is that continuous delivery of features is
only relevant for web-based software, e.g., web applications
or smartphone apps. In the next section, we will justify why
in our experience continuous delivery is essential in many of
our projects and why we are constrained to have a culture of
personal responsibility where each participant must take care
of the entire implementation of a feature instead of having
different people taking over the specialized roles of developer,
tester and database administrator.
III. AGILE APPLIED RESEARCH
Most of the projects within IBM Research – Brazil Lab
are developed for external clients, many of them under the
form of joint development agreements (JDAs). These projects
usually have multi-year contracts where several goals may
be established. Finding out if these goals are possible to
accomplish or not is an essential part of the projects. Therefore,
there is an intrinsic risk in the project that has to be mitigated
by constant assessment of the current progress.
In JDAs, progress has to be visible at all times to our part-
ners’ scientists, i.e., they should be able to run the latest version
of available prototypes and contribute to the development of
theory and even the code. We follow an approach, which we named ResearchOps, where research, development and operations work in a continuous cycle, i.e., the latest prototype version is always available to be used in practice. This process helps
to make clear where progress is being made rapidly and where
researchers are facing difficulties. If a research line is not
bringing good results, alternative approaches may be proposed.
The faster this process happens the better it is for the project.
For these reasons, short release cycles are essential. As in
web development, the goals are not entirely clear up front. Therefore, it is essential to always have a working version to run experiments.
Feitelson et al. report that at Facebook the development of new
features is guided by experience of how users are responding
to existing features. There are no clear requirements that
are established in advance. Similarly, when we develop a
simulator, e.g., for the mechanics of underground rock for-
mations, as soon as the geologists are able to run experiments,
they come up with new hypotheses to verify. This leads to
new feature requests that must be implemented as fast as
possible to confirm or discard these hypotheses. A version
control system that can effectively handle merges between
experimental branches and revert changes is instrumental to
make this process faster.
Another similarity with agile development as practiced at
Facebook [8] is that there is no separate quality assurance
(QA) team that tests the results produced by the researchers.
Each researcher tests his own code. This is not due to any
belief about how development and testing should be done
but a direct consequence of our kind of work. Training a tester to work on our prototypes would require teaching concepts such as geomechanics, natural language processing, or other subjects so that they could create meaningful test cases. For
example, only an expert could generate a test case to verify
that a system correctly applies the physical laws that govern
the behavior of materials given a specific stress and strain state.
IV. EXPERIENCE ON DEVOPS APPLIED TO RESEARCH
This section presents the experience of the lab researchers
when applying DevOps on research projects. We discuss the
challenges and existing open questions found in infrastructure
setup. The challenges in testing scientific software aiming at continuous integration are presented, along with a proposed testing methodology and questions that remain open. Since our lab
aims at deploying scientific software as a service, we also
discuss the challenges that arise in adopting this approach.
A. Challenges in infrastructure setup
In new web application projects, it is easy to use the latest
version of tools and libraries. There is a kind of convergence in
the ecosystem that tends to make the latest versions of Linux
distributions, programming language releases, server software
and front-end libraries compatible with each other. However, as the project ages, its code and the code of the libraries it depends on become legacy code and, consequently, the infrastructure becomes
increasingly hard to maintain.
In scientific computing many libraries have been developed
for decades in Fortran and C/C++. They are usually open-
source, but they do not have the number of maintainers that
other kinds of open-source projects have. In popular open-
source projects, there is a high probability that, for each
platform, there is at least one person that needs to run the code
on it and is willing to contribute the necessary code corrections.
Many scientific libraries are so specialized that only a few
researchers of a specific domain contribute to the code to solve
their specific needs. As a result, it is common to encounter a
situation where the dependencies of a library are not satisfied
by any package that is available in the repositories of the
particular Linux version required to be used. As an example,
it is quite challenging to install a new version of the PETSc⁶ package together with a new version of the g++⁷ compiler that supports the C++11 standard on a RHEL 5⁸ machine. This was
necessary in a numerical simulation project where the software
engineering researchers of the team also wanted to investigate
the use of non-intrusive dependency injection with a scripting
language in C++ [9].
Manually downloading, compiling and installing the entire
software stack for a project must be avoided at all costs.
Many of our researchers are not computer scientists and it is
difficult for them to go through this process. In general, they
proceed by trial and error because they do not know the best
practices, resulting in a waste of time and a lot of frustration.
The same goes for our clients. In many cases, they do not
have the freedom to choose their operating system. They have
workstations that are tightly controlled by corporate IT departments that are extremely conservative and force them to use very outdated versions of Linux. It is critical for the client’s experience
that the installation of our prototypes on their machines is as
smooth as possible.
To address this issue of environment configuration we
invest a lot of time in automation. It is also very important
to replicate our client’s configuration in virtual environments
to run tests. One platform, usually the one that the client
is using, is chosen as the base for this automation. For
instance, if the client uses RHEL 5, we use Vagrant to set up CentOS 5 virtual machines, which are binary compatible, to test our environment automation and the deployment and execution of our software. We rely heavily on the software package managers [10] of Linux distributions. When a package for a specific tool or library is not available, we create custom ones and maintain a private repository for those packages. The benefits are quick installation (as compilation is carried out only once), dependency tracking, and better manageability.
6http://www.mcs.anl.gov/petsc/
7https://gcc.gnu.org/
8http://www.redhat.com/en/technologies/linux-platforms/enterprise-linux
We are also investigating the automation of more complex
environments, such as clusters. Traditional clusters are still
fundamental to run MPI⁹-based workloads like meteorological
simulations. In a cluster environment, the package management
can be a bit different than on standard infrastructures. For
instance, in IBM’s Platform Cluster Manager¹⁰, the master
node keeps track of the software that is installed on the nodes.
Software packages are not installed directly on a node but added to images on the master node. Each node is assigned to an image and, every time a node is activated, it tries to synchronize its configuration with that of the image.
B. Open questions in infrastructure setup
Another topic of interest to us is the interaction of software
product lines (SPL) [11] with IaC. When several projects are
situated in the same problem domain or share very similar
architectures, it makes sense to architect the software for
reuse. In particular, different execution infrastructures could be
modeled as SPL variation points. Depending on the hardware,
numerical calculations can be carried out either by CPUs,
GPUs or other co-processors such as the Xeon Phi¹¹ or FPGAs
(Field Programmable Gate Arrays). The information about
the exact execution infrastructure would only be available at
deployment so the IaC scripts would have to select the correct
variation and apply device specific configurations, maybe even
recompile code to run optimally on the local system. In this
example, the variability of the software architecture would
be reflected directly in the IaC scripts. Extending the SPL
approach to IaC would allow us to approach the development
of infrastructure code in a more systematic and reusable way.
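As a rough sketch of what such a variation point could look like (the module names solver_gpu and solver_cpu and the detection heuristic are hypothetical, not part of any existing tool), deployment-time code might select a device-specific implementation as follows:

    import importlib
    import shutil

    def detect_accelerator():
        # Crude detection heuristic; a real IaC script would query the
        # hardware inventory (e.g., via lspci or vendor tools).
        if shutil.which("nvidia-smi"):
            return "gpu"
        return "cpu"

    def bind_solver_variant():
        # Resolve the SPL variation point to a concrete implementation.
        # The modules solver_gpu and solver_cpu are hypothetical placeholders.
        return importlib.import_module("solver_" + detect_accelerator())

    # solver = bind_solver_variant()   # e.g., solver_gpu on a GPU node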
IaC together with virtualization is also transforming the
way e-Science is done. Traditionally, due to the large
investment and maintenance costs of HPC resources, research
organizations share and pool these resources in large compu-
tational grids. Because they are based on voluntary collabo-
rations between multiple tenants, access to them is subject to
administrative and political issues. In addition, users cannot
simply install the software they need because it could disrupt
the work of other users. If an installed library is updated, it could break many applications that are already deployed. The advent of commercial cloud computing has
brought a different model of large scale computing in which
users pay only for the resources they actually use, and these resources can be scaled up or down according to current demand. In 2006,
Childs et al. realized that they could implement replica grids
or grid testbeds that simulate physical machines with an entirely virtual environment [12]. More recently, several entirely cloud-based grids have been proposed, focusing on cost reduction, more flexibility, ease of use for researchers without a computer science background, and isolation between virtual grids [13], [14].
However, even with reduced complexity, the setup of iso-
lated, single tenant, virtual grids can still be difficult for
researchers without experience in cloud computing and sys-
tem administration. We believe that the use of IaC together
9http://www.mpi-forum.org/docs/
10http://www.ibm.com/systems/platformcomputing/products/
clustermanager/
11http://www.intel.com/content/www/us/en/processors/xeon/
xeon-phi-detail.html
with version control could significantly help novices to start
researching [15]. It is common for new PhD students to inherit
code that was developed by their advisor, older students and
collaborators. This code often lacks comments, documentation
or instructions on how to install or run it. If, in addition to the application code, new researchers also received the IaC code, they could begin their work much faster. A similar argument can also be made for open science. The progress of science depends not only on peer review, but also on the reproduction of results. This is why some researchers believe the code used to obtain the results presented in a paper should also be made available. Hence, the application
code is only a part of the work. The infrastructure code is also
an important part that should be shared.
C. Challenges in testing scientific software for continuous
integration
An important difference between scientific and web development is
the issue of testing. A cornerstone of agile web development
is continuous integration based on unit tests. Tools such as
Jenkins¹² automate the process of taking the latest version of
the software, compiling, running all regression tests and even
uploading approved releases automatically to the production
servers. In order to write an automated regression test, one
has to know what the result of the test should be. However, in
scientific project, usually no one knows in advance what exact
result a research prototype should give.
As an example, in physical simulation the results are often
quite complex and it is not feasible to calculate them by hand.
Even experts are not able to say with precision whether a result is right or wrong; they are only able to assess whether the result is
consistent with the theory. In other words, there is a continuous
spectrum of results where the limits between right and wrong
are fuzzy. In cognitive systems, the situation can be even more
challenging. For instance, Watson [16] is a system that is able
to understand queries in natural language and give results also
in natural language, supported by a large body of knowledge.
It is built combining the output of many different machine
learning algorithms and each one of those is given a weight that
determines how much it will influence the final result. If a unit
test is written based on string matching, then it can fail when
even a small adjustment is made in one of the algorithms or
its weight. A good test would have to understand the response,
i.e., it would require an equally powerful system with natural
language processing and semantic matching just to test it.
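A toy example (with an invented question and invented answers) of why string matching is so brittle here: an exact-match assertion breaks on any rephrasing of a correct answer, and even a crude keyword check, while more tolerant, understands nothing of the response.

    def exact_match_test(answer):
        # Brittle: fails if the system rephrases an otherwise correct answer.
        return answer == "The capital of Brazil is Brasilia."

    def keyword_test(answer):
        # More tolerant, but still far from semantic matching.
        return "brasilia" in answer.lower()

    print(exact_match_test("Brasilia is the capital of Brazil."))  # False
    print(keyword_test("Brasilia is the capital of Brazil."))      # True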
D. Methodology and open research questions for testing
In software where the output is uncertain, the uncertainty
is a result of the complex interaction of many small deter-
ministic parts. For instance, numerical solvers always give the
same result for the exact same input. It is very important to
thoroughly test these deterministic but non-trivial parts to build
the confidence that the overall system is based on a sound
foundation. For this kind of components, we write regression
tests and configure our Jenkins server to execute them as soon
as new versions are submitted to the central version control
server.
12http://jenkins-ci.org/
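A minimal example of the kind of regression test we have in mind for a deterministic building block, using pytest conventions and NumPy; the tridiagonal solver below is a hypothetical stand-in for one of our components, and the tolerances are illustrative.

    import numpy as np

    def tridiagonal_solve(lower, diag, upper, rhs):
        # Hypothetical deterministic building block: Thomas algorithm for
        # a tridiagonal linear system.
        n = len(rhs)
        c, d = np.zeros(n), np.zeros(n)
        c[0], d[0] = upper[0] / diag[0], rhs[0] / diag[0]
        for i in range(1, n):
            denom = diag[i] - lower[i] * c[i - 1]
            c[i] = upper[i] / denom if i < n - 1 else 0.0
            d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
        x = np.zeros(n)
        x[-1] = d[-1]
        for i in range(n - 2, -1, -1):
            x[i] = d[i] - c[i] * x[i + 1]
        return x

    def test_matches_reference_solver():
        # The same input must always give the same result, so a tight
        # tolerance against a trusted dense solver is appropriate here.
        rng = np.random.default_rng(42)
        n = 50
        lower = rng.random(n)
        lower[0] = 0.0
        upper = rng.random(n)
        upper[-1] = 0.0
        diag = 4.0 + rng.random(n)  # diagonally dominant, well conditioned
        rhs = rng.random(n)
        full = np.diag(diag) + np.diag(lower[1:], -1) + np.diag(upper[:-1], 1)
        expected = np.linalg.solve(full, rhs)
        np.testing.assert_allclose(
            tridiagonal_solve(lower, diag, upper, rhs),
            expected, rtol=1e-10, atol=1e-12)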
In some cases, it can be challenging to test a prototype with
complete real-world input data. For instance, a real petroleum reservoir grid with the full description of materials, fractures and faults can be too complex to derive significant conclusions from. We have found it very useful to write a testing framework where scientists can generate test input data using a declarative notation.
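The fragment below gives a flavor of what we mean by a declarative notation, in a deliberately simplified form: the scientist edits a small specification and a helper expands it into the arrays a simulator would consume. The field names and layering scheme are invented for this illustration and do not reflect our actual framework.

    import numpy as np

    # Hypothetical declarative description of a synthetic test reservoir:
    # the scientist edits this specification, not the generation code.
    SPEC = {
        "shape": (20, 20, 5),                # cells in x, y, z
        "layers": [                          # porosity by z-index range
            {"z": (0, 2), "porosity": 0.25},
            {"z": (2, 5), "porosity": 0.10},
        ],
        "fault": {"x": 10, "transmissibility": 0.0},   # sealing fault plane
    }

    def build_grid(spec):
        # Expand the declarative spec into simulator input arrays.
        nx, ny, nz = spec["shape"]
        porosity = np.empty((nx, ny, nz))
        for layer in spec["layers"]:
            z0, z1 = layer["z"]
            porosity[:, :, z0:z1] = layer["porosity"]
        trans_x = np.ones((nx - 1, ny, nz))            # x-face transmissibility
        if "fault" in spec:
            fault = spec["fault"]
            trans_x[fault["x"], :, :] = fault["transmissibility"]
        return {"porosity": porosity, "trans_x": trans_x}

    grid = build_grid(SPEC)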
For more complex integration tests another approach is
needed. Usually human expertise is necessary to verify if
the results are acceptable. In many areas, such as petroleum
engineering, the academic community publishes papers with
benchmark cases where they present their results and conclu-
sions. Other researchers take these benchmark cases, try to
reproduce them with their method and compare their results
with the previous ones in new research papers. We try to find
these kinds of publications in our research area to reproduce
their results within our prototypes. A good result is, e.g., that a curve we generate is compatible with the corresponding curve in a published paper. Moreover, an expert is required to
do a qualitative analysis of previous results, the current results
and to assess if the system is behaving consistently. Hence,
simple unit tests cannot be written for it.
Once such a benchmark is verified, it is easy to write
integration tests where the output is compared for equality against previously saved results. However, such tests can
be very fragile. Consider what would happen if we refined the
model and used a more advanced equation to calculate some
physical quantity. The result would be correct but would fail
the unit test because it would be slightly different. An interest-
ing research direction would be to see how robust testing for
numerical simulations can be done with a statistical analysis
of the results. A framework such as Uncertain&lt;T&gt; [17], which exploits the type system of languages to automatically treat uncertain data with the appropriate statistical tools, could be used to write tests that are flexible but not overly permissive.
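Short of a full statistical treatment, a first step in this direction is to compare curves within tolerances instead of for exact equality. The sketch below (with invented numbers and tolerances; choosing them still requires an expert) accepts results that track a digitized benchmark curve without demanding bit-for-bit agreement with previously saved output.

    import numpy as np

    def curves_compatible(reference, computed,
                          pointwise_rtol=0.05, mean_rtol=0.02):
        # Tolerate small deviations from the benchmark curve instead of
        # requiring exact equality with previously saved results.
        reference = np.asarray(reference, dtype=float)
        computed = np.asarray(computed, dtype=float)
        rel_err = np.abs(computed - reference) / np.maximum(np.abs(reference), 1e-12)
        return bool(rel_err.max() <= pointwise_rtol and rel_err.mean() <= mean_rtol)

    # Pressure decline curve digitized from a benchmark paper (values invented)
    # versus the prototype's output for the same case.
    benchmark = [310.0, 305.2, 298.7, 290.1, 281.4]
    simulated = [309.5, 305.9, 297.8, 289.0, 282.1]
    assert curves_compatible(benchmark, simulated)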
E. Challenges in the Deployment of Scientific Software as a
Service
Cloud computing is a model for enabling ubiquitous,
convenient, on-demand network access to a shared pool of
configurable computing resources (e.g., networks, servers, stor-
age, applications, and services) that can be rapidly provisioned
and released with minimal management effort or service
provider interaction [18], [19]. Cloud Computing has essential
characteristics (on-demand self-service, broad network access,
resource pooling, rapid elasticity, measured service), service
models (IaaS, PaaS, and SaaS), and deployment models of
Cloud infrastructure (private, community, public and hybrid)
that an organization needs to understand well in order to benefit
and improve its environment [18].
As with many other industries, there are several advantages
of moving the delivery of applied science projects to cloud
environments. The deployment is simplified because we decide
on which platform the system will run. Releasing new versions
of software is just a matter of publishing a new version
to the cloud. Storage and compute power is elastic and,
therefore, very attractive for clients who do not wish to invest
in computational infrastructure for compute or data intensive
systems. However, there are downsides, such as: in some cases,
large amounts of data have to be uploaded to the cloud servers;
offline work is not possible; and, as always, there is the concern
of security for confidential client data. This delivery model
brings us much closer to the kind usually associated with
DevOps.
IBM offers both Infrastructure as a Service (IaaS) and Platform as a Service (PaaS). Considering the IaaS offering, SoftLayer¹³ provides both virtual and physical machines,
both with the elasticity of cloud computing, which is a
significant advantage for applications that have low latency
requirements. As an example, this IaaS was used during the
2014 soccer World Cup to host a service developed by our lab
that processed millions of social network messages to provide
real time sentiment analysis and statistics for a smart phone app
produced by the largest media company in Brazil [20]. In the
PaaS service, BlueMix¹⁴ allows developers to upload specially
packaged applications without having to worry much about the
operating system or installed libraries. Currently, there is an
internal effort going on to publish several machine learning
projects to BlueMix to make them available as components
for other projects.
However, there are many challenges in porting scientific software to the cloud. Many such applications are very compute-intensive and use
numerical libraries written in Fortran, C and C++ and parallel
programming libraries such as MPI. MPI assumes that there
is a fixed number of machines that will be available from the
beginning to the end of the computation, which means that
it cannot take advantage of the elastic scaling on the cloud
during the computation. Depending on the structure and the
dependencies of the application, one can deploy it on a PaaS
where many orthogonal concerns are already taken care of.
Otherwise, it must be deployed on an IaaS.
Developing scientific applications for PaaS can be quite
challenging because these platforms are usually more geared
towards web application development [21]. Web application
languages and frameworks usually emphasize speed of devel-
opment over speed of execution. These languages are often
scripting languages with dynamic typing or virtual machine-
based languages, such as Java or C#. Most of these languages
have well-developed module systems and build tools that
can download required modules from open-source repositories
on deployment if required. Different from web applications,
scientific applications are mostly compiled to native machine
code, not only because of raw speed of execution, but also
because there are so many libraries and legacy code written in
languages such as C, C++ and Fortran that it is impossible
to translate everything to a new scripting language. These
languages have no well-developed module systems and there-
fore the compilation and deployment of applications written in
these languages can be quite challenging. The lack of porta-
bility and the difficulty of implementing tools such as Maven¹⁵ or Node Package Modules¹⁶ (npm) for these languages result in very few PaaS platforms supporting them.
IaaS gives developers much more control, since they can
13http://www.softlayer.com
14http://bluemix.net/
15maven.apache.org
16https://www.npmjs.org/
choose what operating system will run on the virtual machine
they manage, and what libraries will be installed. Configuring
these virtual machines and deploying the software can be a
lot of work and, if done manually, can quickly get out of hand
as the configuration evolves over time. There are many great
Infrastructure as Code (IaC) tools available that automate the
configuration of the computational infrastructure, like Chef and
Puppet. We have been developing a service on top of these
tools to ease the setup of virtual environments [15].
V. CONCLUSION AND LESSONS LEARNED
In this work, we presented our experience in developing scientific software using approaches such as agile methods, DevOps, IaC, and cloud computing. We identified the commonalities with agile web application development and delivery. We discussed unique challenges, the solutions we developed, and the questions that are still open. Software and
hardware setup difficulties can be very frustrating for the non-
computer scientists who just want to develop their research.
Some best practices that can help to reduce friction in the
development of scientific projects are:
• Choose a “canonical” platform and make fully configured virtual machines available;
• Find or create packages that must be installed on this platform;
• Build extensive unit tests for deterministic components;
• Build declarative scripting tools that researchers can use to develop tests;
• Introduce distributed version control to researchers;
• Continuously integrate changes to foster discussion and fast validation by peers.
We have identified several challenges for which we have
no answers yet, but our experience report can foster more
discussion and research. The progress of science in many fields
depends increasingly on computers and we believe that much
remains to be done to make applied science more agile.
As future work, we propose to test these findings systematically in a scientific software development project in order to make the benefits and challenges of using the techniques and tools explicit in a quantitative fashion. These results can then be used to develop a framework that categorizes challenges, questions and best practices, and presents strategies and a roadmap of
solutions in terms of methods, models, technologies, and tools,
to support DevOps-oriented scientific applications.
REFERENCES
[1] K. Beck, M. Beedle, A. Van Bennekum, A. Cockburn, W. Cunningham,
M. Fowler, J. Grenning, J. Highsmith, A. Hunt, R. Jeffries et al.,
“Agile manifesto,” 2011. [Online]. Available: http://agilemanifesto.org/
[2] M. Hüttermann, DevOps for Developers. Springer, 2012, vol. 1.
[3] P. Debois, “Agile infrastructure and operations: How infra-gile are you?”
in Agile, 2008. AGILE ’08. Conference, Aug 2008, pp. 202–207.
[4] A. Hunt and D. Thomas, The Pragmatic Programmer: From Jour-
neyman to Master. Boston, MA, USA: Addison-Wesley Longman
Publishing Co., Inc., 1999.
[5] M. Marschall, Chef Infrastructure Automation Cookbook. Packt
Publishing Ltd, 2013.
[6] J. Loope, Managing Infrastructure with Puppet. O’Reilly, 2011.
[7] D. Spinellis, “Don’t install software by hand,” Software, IEEE, vol. 29,
no. 4, pp. 86–87, July 2012.
[8] D. Feitelson, E. Frachtenberg, and K. Beck, “Development and
deployment at facebook,” IEEE Internet Computing, vol. 17, no. 4, pp.
8–17, Jul. 2013. [Online]. Available: http://dx.doi.org/10.1109/MIC.
2013.25
[9] M. de Bayser and R. Cerqueira, “A system for runtime type
introspection in c++,” in Proceedings of the 16th Brazilian Conference
on Programming Languages, ser. SBLP’12, 2012. [Online]. Available:
http://dx.doi.org/10.1007/978-3-642-33182-4_9
[10] D. Spinellis, “Package management systems,” Software, IEEE, vol. 29,
no. 2, pp. 84–86, March 2012.
[11] P. C. Clements and L. M. Northrop, Software Product Lines: Practices
and Patterns. Addison-Wesley Longman Publishing Co., Inc., 2003.
[12] S. Childs, B. Coghlan, and J. McCandless, “Gridbuilder: A tool for
creating virtual grid testbeds,” in Proceedings of the International
Conference on e-Science and Grid Computing. IEEE Computer
Society, 2006, pp. 77–.
[13] R. Strijkers, W. Toorop, A. van Hoof, P. Grosso, A. Belloum, D. Vasunin, C. de Laat, and R. Meijer, “AMOS: Using the cloud for on-demand
execution of e-science applications,” in Proceedings of the International
Conference on e-Science. IEEE Computer Society, 2010, pp. 331–338.
[14] G. Singh, C. Kesselman, and E. Deelman, “Application-level resource
provisioning on the grid,” in Proceedings of the International Confer-
ence on e-Science and Grid Computing. IEEE Computer Society, 2006,
pp. 83–.
[15] M. P. M. de Bayser, L. G. Azevedo, L. P. Tizzei, and R. Cerqueira,
“A tool to support deployment of scientific software as a service,” in
Proceedings of Brazilian e-Science Workshop, 2014.
[16] E. Brown, “Watson: The jeopardy! challenge and beyond,” in Cognitive
Informatics Cognitive Computing (ICCI*CC), 2013 12th IEEE Interna-
tional Conference on, July 2013, pp. 2–2.
[17] J. Bornholt, T. Mytkowicz, and K. S. McKinley, “Uncertain&lt;T&gt;: A first-order type for uncertain data,” SIGARCH Comput. Archit. News,
2014. [Online]. Available: http://doi.acm.org/10.1145/2654822.2541958
[18] P. Mell and T. Grance, “The NIST Definition of Cloud Computing,”
NIST, Tech. Rep., 2011.
[19] R. Buyya, J. Broberg, and A. M. Goscinski, Cloud computing: Princi-
ples and paradigms. John Wiley & Sons, 2010, vol. 87.
[20] P. R. Cavalin, M. A. de Cerqueira Gatti, T. G. P. de Moraes, F. S.
Oliveira, C. Pinhanez, A. Rademaker, and R. A. de Paula, “A scalable
architecture for real-time analysis of microblogging data,” IBM Journal
of Research and Development, vol. TA, no. TA, p. TA, 2015.
[21] S. Jha, D. S. Katz, A. Luckow, A. Merzky, and K. Stamou,
Understanding Scientific Applications for Cloud Environments. John
Wiley & Sons, Inc., 2011, pp. 345–371. [Online]. Available:
http://dx.doi.org/10.1002/9780470940105.ch13