DORA Platform: DevOps Assessment and Benchmarking
Nicole Forsgren1,3, Monica Chiarini Tremblay1, Debra VanderMeer1, Jez Humble2,3
1 College of Business, Florida International University
nicolefv@gmail.com, {tremblay, vanderd}@fiu.edu
2 DORA
3 University of California, Berkeley
humble@berkeley.edu
1 Introduction
In today’s business environment, organizations are challenged by changing customer demands and expectations, competitor pressures, regulatory environments, and increasingly sophisticated outside threats, all arriving at a faster rate than in years past. To manage these challenges, organizations need the ability to deliver software
with both speed and stability. Yearly or even quarterly software releases are no longer
the norm. Organizations from technology (e.g., Etsy and Amazon) to retail (e.g.,
Nordstrom and Target) and others are using integrated software development and
delivery practices to deliver value to their users, beat their competitors to market, and
pivot when the market demands.
Given the complexity of the modern software and infrastructure landscape, technolo-
gy transformation is a non-trivial task. Software development and delivery include
several key capabilities: strong technical practices, decoupled architectures, lean man-
agement practices, and a trusting organizational culture. Technical teams and organi-
zations are left with an increasingly long list of potential capabilities to develop to
improve their ability to deliver software. Technology transformation is thus a portfolio management problem: technology leaders must allocate limited resources across a broad spectrum of potential areas for capability improvement to deliver the greatest benefit.
A clear view of an organization’s current performance is key to any improvement initiative. Formal assessments are one way to achieve this visibility. Assessments and
scorecards are not a new idea (e.g., CMMI [1]), but the industry has yet to find ways
to holistically measure and assess technology capabilities in ways that are based in
research and are repeatable, scalable, and offer industry benchmarks. Currently, most
commercial assessments consist of interviews conducted by a team of consultants [2].
These are heavyweight, expensive, not scalable, and subject to bias from the facilita-
tors (for example, a covert goal can be to try to sell software or continued consulting
services). Furthermore, by their nature, these assessments are myopic, offering only comparisons within the firm: most commercial assessments lack external data, even though research shows that comparison benchmarks can drive performance improvements.
The DORA (DevOps Research and Assessment) platform presented in this paper
seeks to address these limitations. We do this in three stages. First, we build our assessment on prior research that investigates the capabilities that drive improvements in the ability to develop and deliver software. Second, we base our assessment on psychometric methods that are statistically valid and reliable, and therefore consistent and repeatable. Third, we build our platform on a SaaS model that is scalable and provides industry-wide benchmarks.
2 Foundation
The traditional waterfall methodology treats development as a highly structured process that inhibits rapid software development. As a methodology, it does not allow developers to respond quickly to market needs, nor to incorporate feedback from discoveries made during the delivery process. The Agile Manifesto and related agile methodologies emerged as an attempt to address the limitations of traditional waterfall methods by leveraging feedback and embracing change through short, incremental, iterative delivery cycles. While agile methods address aspects of the challenges seen in traditional methodologies and help to speed up the development process, they are often subject to limitations in the planning (i.e., upstream) and deployment (i.e., downstream) stages.
In response to these developments, the DevOps movement started in the late 2000s. One notable difference of DevOps from agile is its extension of agile processes beyond the development role downstream into IT operations. Additional differences include the application of lean manufacturing concepts, such as work-in-process (WIP) limits and
visualization of work. Finally, and most importantly, DevOps highlights the im-
portance of communication across organizational boundaries and a high-trust culture
[3]. To maintain competitive advantage in the market, enterprise technology leaders
are undertaking technology transformations to move from traditional and agile meth-
odologies to DevOps methodologies that are continually improved.
Understanding how an organization currently performs (i.e., measuring and baselining performance) is important to any continuous improvement initiative (e.g., [4]).
Many organizations lack the instrumentation or expertise to measure their capabilities
in a holistic way, and therefore seek external assessment options.
The research to support this type of assessment exists, both in industry reports (e.g.,
[5,6,7]) and in academic papers (e.g., [3,8]). However, for several reasons, there cur-
rently are no direct ways for technology leaders to apply these findings to their organ-
izations with easy, scalable methods. First, current methods focus on qualitative ap-
proaches, which are not scalable and are not appropriate for comparison across time
periods and organizations. Second, few in industry understand how to apply behavior-
al psychometric models and methods; nor do they sufficiently understand analysis,
research design, and implementation requirements. Third, team members may not feel
safe reporting system and environment performance to internal leaders, but do feel
safer reporting to an external anonymized SaaS system [9]. Finally, the unavailability
of external benchmarking data to drive performance comparisons, and the inability to
measure improvement quantitatively over time in relation to changes in the rest of the
industry, prevent teams from understanding the dynamic, wider context in which they
operate. This can lead to teams failing to take sufficiently strong action, and falling
further behind the industry over time.
3 The DORA Assessment Tool
To address the aforementioned challenges, the DORA assessment tool was designed
to target business leaders, either directly or indirectly (i.e., through channel partners
such as consultancies or system integrators, who can offer the assessment as part of a
larger engagement). The DORA assessment tool process is illustrated in Figure 1.
The DORA Assessment Platform is built using PHP and is delivered as a SaaS solution hosted on AWS, with data stored and processed in AWS East Region 1 across two independent availability zones for disaster recovery. The tool collects no personally identifiable information (PII) and stores no IP addresses, making the assessment and analysis appropriate for use in all geographies (for example, the assessment meets UK privacy law guidelines).
Figure 1. Assessment Tool Process
3.1 Measurement Components
IT performance comprises four measurements: lead time for changes, deploy frequency, mean time to restore (MTTR), and change fail rate. Lead time is how long it takes an organization to go from code commit to code successfully running in production or in a releasable state. Deploy frequency is how often code is deployed. MTTR is how long it generally takes to restore service when a service incident occurs (e.g., an unplanned outage or service impairment). Change fail rate is the percentage of changes that result in degraded service or subsequently require remediation (e.g., they lead to service impairment or a service outage, or require a hotfix, a fix forward, or a patch).
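To make these definitions concrete, the following sketch shows how the four measures could be computed from raw deployment and incident records. It is a minimal, illustrative example; the record fields and function names are assumptions for exposition and do not reflect the DORA platform's implementation.

```python
from datetime import datetime
from statistics import median

# Illustrative delivery records; the field names are assumptions, not the DORA schema.
deployments = [
    {"committed": datetime(2017, 5, 1, 9, 0),
     "deployed": datetime(2017, 5, 1, 15, 0), "failed": False},
    {"committed": datetime(2017, 5, 2, 10, 0),
     "deployed": datetime(2017, 5, 3, 11, 0), "failed": True},
    {"committed": datetime(2017, 5, 4, 8, 0),
     "deployed": datetime(2017, 5, 4, 9, 30), "failed": False},
]
incidents = [
    {"started": datetime(2017, 5, 3, 11, 30),
     "restored": datetime(2017, 5, 3, 12, 15)},
]

def lead_time_hours(deps):
    """Median time from code commit to code running in production, in hours."""
    return median((d["deployed"] - d["committed"]).total_seconds() / 3600 for d in deps)

def deploys_per_day(deps, window_days):
    """Average number of deployments per day over the reporting window."""
    return len(deps) / window_days

def mttr_hours(incs):
    """Mean time to restore service after a service incident, in hours."""
    return sum((i["restored"] - i["started"]).total_seconds() / 3600 for i in incs) / len(incs)

def change_fail_rate(deps):
    """Percentage of changes that degraded service or required remediation."""
    return 100.0 * sum(d["failed"] for d in deps) / len(deps)

print(f"lead time: {lead_time_hours(deployments):.1f} h, "
      f"deploys/day: {deploys_per_day(deployments, window_days=7):.2f}, "
      f"MTTR: {mttr_hours(incidents):.2f} h, "
      f"change fail rate: {change_fail_rate(deployments):.0f}%")
```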
Key capabilities are measured across four main dimensions. The technical dimen-
sion includes practices that are important components of the continuous delivery par-
adigm, such as: the use of version control, test automation, deployment automation,
trunk-based development, and shifting left on security. The process dimension in-
cludes several ideas from lean manufacturing such as: visualization of work (such as
dashboards), decomposition of work (allowing for single piece flow), and work in
process limits. The measurement dimension includes the use of metrics to make busi-
ness decisions and the use of monitoring tools. And finally, the cultural dimension
includes measures of culture that are indicative of high trust and information flow, the
value of learning, and job satisfaction.
3.2 Survey Deployment
The DORA assessment surveys technologists along the full software product delivery
value stream (i.e., those in development, test, QA, IT operations, information security,
and product management). This is different from other assessments in that all tech-
nologists are polled and not just a handful, and practitioners on the ground are as-
sessed, not just leadership. The surveys include psychometric measures that capture
system and team behaviors along four key dimensions: technical, lean management,
monitoring, and cultural practices. Completing the survey takes approximately 20 minutes, and the survey items draw on prior work (see [2, 4, 5, 6, 7] for the latent constructs referenced in the assessment).
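As an illustration of the general psychometric approach, the sketch below scores latent constructs by averaging a respondent's Likert-scale item responses per construct. The item and construct names, and the simple mean scoring, are illustrative assumptions only; they are not the actual DORA survey items or its analysis procedure.

```python
from statistics import mean

# Hypothetical Likert-scale responses (1 = strongly disagree ... 7 = strongly agree).
# Item and construct names are illustrative, not the actual DORA survey items.
response = {
    "culture_info_flow": 6, "culture_trust": 5, "culture_learning": 6,
    "tech_version_control": 7, "tech_test_automation": 4, "tech_trunk_based": 3,
}

# Map each latent construct to the survey items intended to measure it.
constructs = {
    "generative_culture": ["culture_info_flow", "culture_trust", "culture_learning"],
    "continuous_delivery": ["tech_version_control", "tech_test_automation", "tech_trunk_based"],
}

def score_constructs(resp, construct_items):
    """Score each construct as the mean of its items for a single respondent."""
    return {name: mean(resp[item] for item in items)
            for name, items in construct_items.items()}

print(score_constructs(response, constructs))
```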
The engagement model for the DORA assessment is one in which technology leaders can act on the results and analysis provided. This may mean looking to internal champions and technical expertise, or it may mean engaging consultants to build out roadmaps and act on the guidance provided. When running an assessment, a survey manager meets with the client to determine the right sampling strategy for optimal data collection. The survey manager then partners with the client to send survey invitations to the client's teams, and the platform collects responses.
At the end of data collection, the responses are analyzed, and the reports are generat-
ed. These reports are sent to the client management team, and optionally, to the client
teams. The DORA assessment tool delivers the following: (1) measurement of the key capabilities described above; (2) benchmarking of these capabilities against the client's own organization, the industry, and aspirational peers within the industry; and (3) identification of priorities for high-impact investments in capability development.
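To illustrate the idea of an industry benchmark, the following sketch places a team's capability score at a percentile within a sample of industry scores. The scores and the percentile-rank approach are assumptions for exposition, not DORA's actual reporting logic.

```python
def percentile_rank(score, industry_scores):
    """Percentage of industry scores that fall at or below the given score."""
    at_or_below = sum(1 for s in industry_scores if s <= score)
    return 100.0 * at_or_below / len(industry_scores)

# Hypothetical construct scores (1-7 scale) for the same capability across an industry sample.
industry = [3.2, 4.1, 4.8, 5.0, 5.5, 5.9, 6.2, 6.6]
team_score = 5.3

print(f"Team is at roughly the {percentile_rank(team_score, industry):.0f}th "
      "percentile of the industry sample for this capability")
```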
4 Case Study
We present a case study demonstrating the utility of the DORA platform. Fin500 is a Fortune 500 company and one of the ten largest banks in the United States. The company focuses on an innovative approach to customized services and offerings; this approach requires the ability to develop and deliver quality software rapidly. The Fin500 team was interested in the DORA assessment platform because the various measurement and assessment tools they had been using were either too narrow or too complicated, did not offer actionable insights, or did not show them how they compared against the industry. Crucially, these other tools did not identify which capabilities were the most important for them to focus on first. Only the DORA platform provided all three: holistic measurement, an industry benchmark, and identification of the most important capabilities.
Following assessments across 17 teams and seven business units, DORA’s analysis
identified two key areas for capability development: automating change control pro-
cesses and trunk-based development. Trunk-based development is a coding practice characterized by developers working off a single mainline in a code repository, branches with very short lifetimes before being merged into master, and application teams that rarely or never have “code lock” periods during which no one can check in code or submit pull requests due to merge conflicts, code freezes, or stabilization phases.
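As a rough illustration only (not part of the DORA assessment, which is survey based), a team could approximate how far it is from trunk-based development by flagging long-lived local branches; the heuristic below uses days since each branch's last commit as a simple proxy, and assumes a local git repository with a `master` mainline.

```python
import subprocess
from datetime import datetime, timezone

def branch_ages_days(repo_path="."):
    """Return {branch: days since its last commit} for all local branches."""
    out = subprocess.run(
        ["git", "for-each-ref",
         "--format=%(refname:short) %(committerdate:unix)", "refs/heads"],
        cwd=repo_path, capture_output=True, text=True, check=True,
    ).stdout
    now = datetime.now(timezone.utc).timestamp()
    ages = {}
    for line in out.splitlines():
        name, ts = line.rsplit(" ", 1)
        ages[name] = (now - int(ts)) / 86400.0
    return ages

# Branches that have lived for more than a few days are one rough signal that the
# team is drifting away from trunk-based development.
for branch, age in branch_ages_days().items():
    if branch != "master" and age > 3:
        print(f"{branch}: last commit {age:.1f} days ago")
```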
While the team was aware that their change approval processes were a likely candidate for improvement, the analysis offered an evidence-based second opinion, providing the necessary leverage to prioritize the work. Trunk-based development proved to be a bigger challenge: some were skeptical that it would be a key driver of IT performance improvement. But the analysis was clear: these capabilities were key. Fin500 created organization-wide working groups and workshops on branching strategies and worked to reduce the number of manual approvals in their change approval processes. In just two months, the team was able to increase the
number of releases to production from 40 to over 800. Furthermore, this improvement
occurred with no increase in production incidents or outages.
The teams and their leadership also commented on the value of participating in the
assessment, since the survey itself highlights and reinforces behaviors and best prac-
tices across the dimensions described above. The DORA assessment becomes both a
measurement and a learning opportunity, creating a shared understanding of how to
drive improvement across the organization. A screencast of the DORA platform can
be seen here: http://bit.ly/2k5SYJW.
5 References
1. CMMI Product Team (2010). CMMI® for Development, Version 1.3: Improving processes for developing better products and services. Technical Report CMU/SEI-2010-TR-033. Software Engineering Institute.
2. Shetty, Y. K. (1993). Aiming high: competitive benchmarking for superior performance.
Long Range Planning, 26(1), 39-44.
3. Forsgren, N., & Humble, J. (2016). The Role of Continuous Delivery in IT and Organizational Performance. In Proceedings of the Western Decision Sciences Institute (WDSI) 2016, Las Vegas, NV.
4. Shetty, Y. K. (1993). Aiming high: competitive benchmarking for superior perfor-
mance. Long Range Planning, 26(1), 39-44.
5. Forsgren Velasquez, N., Kim, G., Kersten, N., & Humble, J. (2014). 2014 State of DevOps
Report.
6. Puppet Labs and IT Revolution (2015). 2015 State of DevOps Report.
7. Brown, A., Forsgren, N., Humble, J., Kersten, N., & Kim, G. (2016). 2016 State of
DevOps Report.
8. Forsgren, N., & Humble, J. (2016). DevOps: Profiles in ITSM Performance and Contributing Factors. In Proceedings of the Western Decision Sciences Institute (WDSI) 2016, Las Vegas, NV.
9. Lawler, E. E., Nadler, D., & Cammann, C. (1980). Organizational assessment: Perspec-
tives on the measurement of organizational behavior and the quality of work life. John
Wiley & Sons.