
Nine Steps to Move Forward from Error



D. D. Woods and R. I. Cook
Institute for Ergonomics, Ohio State University, Columbus, Ohio, USA;
Department of Anesthesia and Critical Care,
University of Chicago, Chicago, Illinois, USA
Abstract: Following celebrated failures, stakeholders begin to ask questions about how to improve the systems and processes they operate, manage or depend
on. In this process it is easy to become stuck on the label ‘human error’ as if it were an explanation for what happened and as if such a diagnosis specified
steps to improve. To guide stakeholders when celebrated failure or other developments create windows of opportunity for change and investment, this paper
draws on generalizations from the research base about how complex systems fail and about how people contribute to safety and risk to provide a set of Nine
Steps forward for constructive responses. The Nine Steps forward are described and explained in the form of a series of maxims and corollaries that summarize
general patterns about error and expertise, complexity and learning.
Keywords: Error; Failure in complex systems; Patient safety
Dramatic and celebrated failures are dreadful events that
lead stakeholders to question basic assumptions about how
the system in question works and sometimes breaks down.
As each of these systems is under pressure to achieve new
levels of performance and utilise costly resources more
efficiently, it is very difficult for these stakeholders in high-
risk industries to make substantial investments to improve
safety. In this context, common beliefs and fallacies about
human performance and about how systems fail undermine
the ability to move forward.
On the other hand, over the years researchers on human
performance, human–computer cooperation, teamwork,
and organisational dynamics have turned their attention
to high-risk systems studying how they fail and often
succeed. While there are many attempts to summarise these
research findings, stakeholders have a difficult time acting
on these lessons, especially as they conflict with conven-
tional views, require difficult trade-offs and demand
sacrifices on other practical dimensions.
In this paper we use generalisations from the research
base about how complex systems fail and how people
contribute to safety as a guide for stakeholders when
celebrated failure or other developments create windows of
opportunity for change and investment. Nine steps forward
are described and explained in the form of a series of maxims
and corollaries that summarise general patterns about error
and expertise, complexity and learning. These ‘nine steps’
define one checklist for constructive responses when
windows of opportunity to improve safety arise:
1. Pursue second stories beneath the surface to discover
multiple contributors.
2. Escape the hindsight bias.
3. Understand work as performed at the sharp end of the system.
4. Search for systemic vulnerabilities.
5. Study how practice creates safety.
6. Search for underlying patterns.
7. Examine how change will produce new vulnerabilities
and paths to failure.
8. Use new technology to support and enhance human expertise.
9. Tame complexity through new forms of feedback.
When an issue breaks with safety at the centre, it has been
and will be told as a ‘first story’. First stories, biased by
knowledge of outcome, are overly simplified accounts of the
apparent ‘cause’ of the undesired outcome. The hindsight
bias narrows and distorts our view of practice after-the-fact.
As a result:
• there is premature closure on the set of contributors that
lead to failure;
• the pressures and dilemmas that drive human performance
are masked; and
• how people and organisations work to overcome hazards
and make safety is obscured.
Cognition, Technology & Work (2002) 4:137–144. © 2002 Springer-Verlag London Limited.
Stripped of all the context, first stories are appealing
because they are easy to tell and locate the important
‘cause’ of failure in practitioners closest to the outcome.
First stories appear in the press and usually drive the public,
legal, and regulatory reactions to failure. Unfortunately,
first stories simplify the dilemmas, complexities, and
difficulties practitioners face and hide the multiple
contributors and deeper patterns. The distorted view leads
to proposals for ‘solutions’ that are weak or even counter-
productive and blocks the ability of organisations to learn
and improve.
For example, this pattern has been repeated over the last
few years as the patient safety movement in health care has
emerged. Each new celebrated failure produces general
apprehension and calls for action. The first stories convince
us that there are basic gaps in safety. They cause us to ask
questions like: ‘How big is this safety problem?’ ‘Why didn’t
someone notice it before?’ and ‘Who is responsible for this
state of affairs?’
The calls to action based on first stories have followed a
regular pattern:
• demands for increasing the general awareness of the issue
among the public, media, regulators and practitioners
(‘we need a conference . . .’);
• calls for others to try harder or be more careful (‘those
people should be more vigilant about . . .’);
• insistence that real progress on safety can be made easily
if some local limitation is overcome (‘we can do a better
job if only . . .’);
• calls for more extensive, more detailed, more frequent
and more complete reporting of problems (‘we need
mandatory incident reporting systems with penalties for
failure to report’); and
• calls for more technology to guard against erratic people
(‘we need computer order entry, bar coding, electronic
medical records, etc.’).
Actually, first stories represent a kind of reaction to failure
that attributes the cause of accidents to narrow proximal
factors, usually ‘human error’. They appear to be attractive
explanations for failure, but they lead to sterile responses
that limit learning and improvement (blame and punish-
ment; e.g., ‘we need to make it so costly for people that they
will have to . . .’).
When we observe this process begin to play out over an
issue or celebrated event, the constructive response is very
simple. To make progress on safety requires going beyond
first stories to discover what lies behind the term ‘human
error’ (Cook et al 1998). At the broadest level, our role is to
help others develop the deeper ‘second story’. This is the
most basic lesson from past research on how complex
systems fail. When one pursues second stories the system in
question looks very different and one can begin to see how
the system moves toward, but is usually blocked from,
accidents. Through these deeper insights learning occurs
and the process of improvement begins.
I. The Second Story Maxim
Progress on safety begins with uncovering ‘second stories’.
The remaining steps specify how to extract the second
stories and how they can lead to safety improvement.
The first story after celebrated accidents tells us nothing
about the factors that influence human performance before
the fact. Rather the first story represents how we, with
knowledge of outcome and as stakeholders, react to failures.
Reactions to failure are driven by the consequences of
failure for victims and other stakeholders and by the costs
associated with changes made to satisfy stakeholders that
the threats represented by the failure are under sufficient
control. This is a social and political process about how we
attribute ‘cause’ for dreadful and surprising breakdowns in
systems that we depend on (Woods et al 1994; Schon
Knowledge of outcome distorts our view of the nature of
practice. We simplify the dilemmas, complexities and
difficulties practitioners face and how they usually cope
with these factors to produce success. The distorted view
leads people to propose ‘solutions’ that actually can be counterproductive
(a) if they degrade the flow of information that supports
learning about systemic vulnerabilities; and
(b) if they create new complexities to plague practice.
Research-based approaches fundamentally use various
techniques to escape from hindsight bias. This is a crucial
prerequisite for learning to occur.
When we start to pursue the ‘second story’, our attention is
directed to people working at the sharp end of a system
such as health care. The substance of the second story
resides at the sharp end of the system as organisational,
economic, human and technological factors play out to
create outcomes. Sharp end practitioners who work in this
setting face a variety of difficulties, complexities,
dilemmas and trade-offs and are called on to achieve
multiple, often conflicting, goals. Safety is created here at
the sharp end as practitioners interact with the hazardous
processes inherent in the field of activity in the face of the
multiple demands and using the available tools and resources.
To follow second stories, one looks more broadly than a
single case to understand how practitioners at the sharp end
function – the nature of technical work as experienced by
the practitioner in context. This is seen in research as a
practice-centred view of technical work in context (Barley and
Orr 1997).
Ultimately, all efforts to improve safety will be
translated into new demands, constraints, tools or resources
that appear at the sharp end. Improving safety depends on
investing in resources that support practitioners in meeting
the demands and overcoming the inherent hazards in that setting.
II. The Technical Work in Context Maxim
Progress on safety depends on understanding how practi-
tioners cope with the complexities of technical work.
When we shift our focus to technical work in context, we
begin to ask how people usually succeed. Ironically,
understanding the sources of failure begins with under-
standing how practitioners coordinate activities in ways
that help them cope with the different kinds of complex-
ities they experience. Interestingly, the fundamental insight
20 years ago that launched the New Look behind the label
human error was to see human performance at work as
human adaptations directed to cope with complexity
(Rasmussen 1986).
One way that some researchers have summarised the
results that lead to Maxim II is that:
‘The potential cost of misunderstanding technical work’ is the risk
of setting policies whose actual effects are ‘not only unintended but
sometimes so skewed that they exacerbate the problems they seek
to resolve’. ‘Efforts to reduce error misfire when they are predicated
on a fundamental misunderstanding of the primary sources of
failures in the field of practice [systemic vulnerabilities] and on
misconceptions of what practitioners actually do.’ (Barley and Orr
1997, p. 18; emphasis added)
Three corollaries to the Technical Work in Context
Maxim can help focus efforts to understand technical
work as it affects the potential for failure:
Corollary IIA. Look for Sources of Success
To understand failure, understand success in the face of
Failures occur in situations that usually produce successful
outcomes. In most cases, the system produces success
despite opportunities to fail. To understand failure requires
understanding how practitioners usually achieve success in
the face of demands, difficulties, pressures and dilemmas.
Indeed, it is clear that success and failure flow from the
same sources (Rasmussen 1985).
Corollary IIB. Look for Difficult Problems
To understand failure, look at what makes problems difficult.
Understanding failure and success begins with under-
standing what makes problems difficult. Cook et al
(1998) illustrated the value of this approach in their
tutorial for health care, ‘The tale of two stories’. They used
three uncelebrated second stories from health care to show
that progress depended on investigations that identified the
factors that made certain situations more difficult to handle
and then explored the individual and team strategies used
to handle these situations. As the researchers began to
understand what made certain kinds of problems difficult,
how expert strategies were tailored to these demands and
how other strategies were poor or brittle, new concepts
were identified to support and broaden the application of
successful strategies.
Corollary IIC. Be Practice-Centred – Avoid the Psychologist’s Fallacy
Understand the nature of practice from the practitioner’s
point of view.
It is easy to commit what William James (1890) called, over one
hundred years ago, the Psychologist’s Fallacy.
Updated to today, this fallacy occurs when well-intentioned
observers think that their distant view of the workplace
captures the actual experience of those who perform
technical work in context. Distant views can miss important
aspects of the actual work situation and thus can miss critical
factors that determine human performance in that field of
practice. To avoid the danger of this fallacy, cognitive
anthropologists use research techniques based on an ‘emic’
or practice-centred perspective (Hutchins, 1995). Research-
ers on human problem solving and decision making refer to
the same concept with labels such as process tracing and
naturalistic decision making (Klein et al 1993).
It is important to distinguish clearly that doing technical
work expertly is not the same thing as expert understanding
of the basis for technical work. This means that
practitioners’ descriptions of how they accomplish their
work are often biased and cannot be taken at face value.
For example, there can be a significant gap between
people’s descriptions (or self-analysis) of how they do
something and observations of what they actually do.
Since technical work in context is grounded in the
details of the domain itself, it is also insufficient to be
expert in human performance in general. Understanding
technical work in context requires (1) in-depth apprecia-
tion of the pressures and dilemmas practitioners face and
the resources and adaptations practitioners bring to bear to
accomplish their goals, and also (2) the ability to step back
and reflect on the deep structure of factors that influence
human performance in that setting. Individual observers
rarely possess all of the relevant skills, so that progress on
understanding technical work in context and the sources of
safety inevitably requires interdisciplinary cooperation.
In the final analysis, successful practice-centred inquiry
requires a marriage between the following three factors:
• the view of practitioners in context;
• technical knowledge in that area of practice; and
• knowledge of general results/concepts about the various
aspects of human performance that play out in that setting.
Interdisciplinary collaborations have played a central role
as health care has begun to make progress on iatrogenic
risks and patient safety recently (e.g., Hendee 1999).
This leads us to note a third maxim:
III. The Interdisciplinary Synthesis Maxim
Progress on safety depends on facilitating interdisciplinary
Practice-centred observation and studies of
technical work in context show that safety is not found in a single
person, device or department of an organisation. Instead,
safety is created and sometimes broken in systems, not
individuals (Cook et al 2000). The issue is finding systemic
vulnerabilities, not flawed individuals.
IV. The Systems Maxim
Safety is an emergent property of systems and not of their components.
Examining technical work in context with safety as our
purpose, one will notice many hazards, complexities, gaps,
trade-offs, dilemmas and points where failure is possible.
One will also begin to see how practice has evolved to cope
with these kinds of complexities. After elucidating com-
plexities and coping strategies, one can examine how these
adaptations are limited, brittle and vulnerable to break-
down under differing circumstances. Discovering these
vulnerabilities and making them visible to the organisation
is crucial if we are to anticipate future failures and institute
change to head them off.
A repeated finding from research on complex systems is
that practitioners and organisations have opportunities to
recognise and react to threats to safety. Precursor events
may serve as unrecognised ‘dress rehearsals’ for future
accidents. The accident itself often evolves through time so
that practitioners can intervene to prevent negative
outcomes or to reduce their consequences. Doing this
depends on being able to recognise accidents-in-the-
making. However, it is difficult to act on information
about systemic vulnerabilities as potential interventions
often require sacrificing some goals under certain circum-
stances (e.g., productivity) and therefore generate conflicts
within the organisation.
Detection and recovery from incipient failures is a
crucial part of achieving safety at all levels of an
organisation – a corollary to the Systems Maxim. Successful
individuals, groups and organisations, from a safety point of
view, learn about complexities and the limits of current
adaptations and then have mechanisms to act on what is
learned, despite the implications for other goals (Rochlin
1999; Weick and Roberts 1993).
Corollary IVA. Detection and Recovery Are Critical to
Understand how the system of interest supports (or fails to
support) detection and recovery from incipient failures.
In addition, this process of feedback, learning and
adaptation should go on continuously across all levels of
an organisation. With change, some vulnerabilities decay
while new paths to failure emerge. To track the shifting
pattern requires getting information about the effects of
change on sharp end practice and about new kinds of
incidents that begin to emerge. If the information is rich
enough and fresh enough, it is possible to forecast future
forms of failure, to share schemes to secure success in the
face of changing vulnerabilities. Producing and widely
sharing this sort of information may be one of the hallmarks
of a culture of safety (Weick et al. 1999).
However, establishing a flow of information about
systemic vulnerabilities is quite difficult because it is
frightening to consider how all of us, as part of the
system of interest, can fail. Repeatedly, research notes that
blame and punishment will drive this critical information
underground. Without a safety culture, systemic vulner-
abilities become visible only after catastrophic accidents. In
the aftermath of accidents, learning also is limited because
the consequences provoke first stories, simplistic attribu-
tions and shortsighted fixes.
Understanding the ‘systems’ part of safety involves
understanding how the system itself learns about safety
and responds to threats and opportunities. In organisational
safety cultures, this activity is prominent, sustained and
highly valued (Cook 1999). The learning processes must be
tuned to the future to recognise and compensate for
negative side effects of change and to monitor the changing
landscape of potential paths to failure. Thus, the Systems
Maxim leads to the corollary to examine how the
organisation at different levels of analysis supports or fails
to support the process of feedback, learning and adaptation.
Corollary IVB. Learning how to Learn
Safe organisations deliberately search for and learn about
systemic vulnerabilities.
The future culture all aspire to is one where stakeholders
can learn together about systemic vulnerabilities and work
together to address those vulnerabilities, before celebrated
failures occur (Woods, 2000).
Typically, reactions to failure assume the system is ‘safe’ (or
has been made safe) inherently and that overt failures are
only the mark of an unreliable component. But what is
irreducible is uncertainty about the future, change and finite
resources. As a result, all systems confront inherent hazards,
trade-offs and are vulnerable to failure. Second stories reveal
how practice is organised to allow practitioners to create
success in the face of threats. Individuals, teams and
organisations are aware of hazards and adapt their practices
and tools to guard against or defuse these threats to safety. It
is these efforts that ‘make safety’. This view of the human
role in safety has been a part of complex systems research
since its origins (see Rasmussen et al 1994, ch. 6). The
Technical Work in Context Maxim tells us to study how
practice copes with hazards and resolves trade-offs, for the
most part succeeding yet in some situations failing.
However, the adaptations of individuals, teams and
organisations can be limited or stale so that feedback about
how well adaptations are working or about how the
environment is changing is critical. Examining the
weaknesses and strengths, costs and benefits of these
adaptations points to the areas ripe for improvement. As
a result, progress depends on studying how practice creates
safety in the face of challenges – expertise in context (Feltovich
et al 1997; Klein, 1998).
In the discussions of some particular episode or ‘hot button’
issue it is easy for commentators to examine only surface
characteristics of the area in question. Progress has come
from going beyond the surface descriptions (the phenotypes
of failures) to discover underlying patterns of systemic
factors (genotypical patterns; see Hollnagel 1993; 1998).
V. The Genotypes Maxim
Progress on safety comes from going beyond the surface
descriptions (the phenotypes of failures) to discover under-
lying patterns of systemic factors (genotypical patterns).
Genotypes are concepts and models about how people,
teams and organisations coordinate information and
activities to handle evolving situations and cope with the
complexities of that work domain. These underlying
patterns are not simply about knowledge of one area in a
particular field of practice. Rather, they apply, test and
extend knowledge about how people contribute to safety
and failure and how complex systems fail by addressing the
factors at work in this particular setting. As a result, when
we examine technical work, search for underlying patterns
by contrasting sets of cases.
As capabilities, tools, organisations and economic pressures
change, vulnerabilities to failure change as well.
VI. Safety is a Dynamic Process Maxim
The state of safety in any system always is dynamic.
Systems exist in a changing world. The environment,
organisation, economics, capabilities, technology, manage-
ment and regulatory context all change over time. This
backdrop of continuous systemic change ensures that
hazards and how they are managed are constantly changing.
Plus, the basic pattern in complex systems is a drift toward
failure as planned defences erode in the face of production
pressures and change. As a result, when we examine
technical work in context, we need to understand how
economic, organisational and technological change can
create new vulnerabilities in spite of or in addition to
providing new benefits.
Research reveals that organisations that manage poten-
tially hazardous technical operations remarkably success-
fully create safety by anticipating and planning for
unexpected events and future surprises. These organisations
did not take past success as a reason for confidence. Instead
they continued to invest in anticipating the changing
potential for failure because of the deeply held under-
standing that their knowledge base was fragile in the face of
the hazards inherent in their work and the changes
omnipresent in their environment (Rochlin 1999).
Research results have pointed to several corollaries to
the Dynamic Process Maxim.
Corollary VIA. Law of Stretched Systems
Under resource pressure, the benefits of change are taken in
increased productivity, pushing the system back to the edge
of the performance envelope.
Change occurs to improve systems. However, because the
system is under resource and performance pressures from
stakeholders, we tend to take the benefits of change in the
form of increased productivity and efficiency and not in the
form of a more resilient, robust and therefore safer system
(Rasmussen 1986). Researchers in the field speak of this
observation as follows: systems under pressure move back to
the ‘edge of the performance envelope’ or the Law of
Stretched Systems (Woods 2002):
. . . we are talking about a law of systems development, which is
every system operates, always at its capacity. As soon as there is
some improvement, some new technology, we stretch it . . .
(Hirschhorn 1997)
Change under resource and performance pressures tends
to increase coupling, that is, the interconnections
between parts and activities, in order to achieve greater
efficiency and productivity. However, research has found
that increasing coupling also increases operational com-
plexity and increases the difficulty of the problems
practitioners can face. Jens Rasmussen (1986) and Charles
Perrow (1984) provided some of the first accounts of the
role of coupling and complexity in modern system
Corollary VIB. Increasing Coupling Increases Complexity
Increased coupling creates new cognitive and collaborative
demands and new forms of failure.
Increasing the coupling between parts in a process changes
how problems manifest, creating or increasing complexities
such as more effects at a distance, more and faster cascades
of effects, tighter goal conflicts, more latent factors. As a
result, increased coupling between parts creates new
cognitive and collaborative demands which contribute to
new forms of failure (Woods 1988; Woods and Patterson
Because all organisations are resource limited to one
degree or another, we are often concerned with how to
prioritise issues related to safety. The Dynamic Process
Maxim suggests that we should consider focusing our
resources on anticipating how economic, organisational
and technological change could create new vulnerabil-
ities and paths to failure. Armed with this knowledge we
can address or eliminate these new vulnerabilities at a
time when intervention is less difficult and less expen-
sive (because the system is already in the process of
change). In addition, these points of change are at the
same time opportunities to learn how the system actually
VII. The Window of Opportunity Maxim
Use periods of change as windows of opportunity to
anticipate and treat new systemic vulnerabilities.
The notion that it is easy to get ‘substantial gains’ through
computerisation is common in many fields. The implication
is that computerisation by itself reduces human error and
system breakdown. Any difficulties that are raised about the
computerisation process become mere details to be worked
out later.
But this idea, which Woods stated a long time ago as ‘a
little more technology will be enough’, has not turned out
to be the case in practice (for an overview see Woods et
al 1994, ch. 5 or Woods and Tinapple 1999). Those pesky
details turn out to be critical in whether the computerisation
creates new forms of failure. New technology can
help and can hurt, often at the same time depending on
how the technology is used to support technical work in
context.
Basically, it is the underlying complexity of operations
that contributes to the human performance problems.
Improper computerisation can simply exacerbate or create
new forms of complexity to plague operations. The
situation is complicated by the fact that new technology
often has benefits at the same time that it creates new
VIII. Joint Systems Maxim
People and computers are not separate and independent,
but are interwoven into a distributed system that performs
cognitive work in context.
The key to skilful as opposed to clumsy use of
technological possibilities lies in understanding the factors
that lead to expert performance and the factors that
challenge expert performance. The irony is that once we
understand the factors that contribute to expertise and to
breakdown, we then will understand how to use the powers
of the computer to enhance expertise. This is illustrated in
uncelebrated second stories in research on human perfor-
mance in medicine, explored in Cook et al (1998). On the
one hand, new technology creates new dilemmas and
demands new judgments, but, on the other hand, once the
basis for human expertise and the threats to that expertise
had been studied, technology was an important means to
the end of enhanced system performance.
We can achieve substantial gains by understanding the
factors that lead to expert performance and the factors that
challenge expert performance. This provides the basis to
change the system, for example, through new computer
support systems and other ways to enhance expertise in
As a result, when we examine technical work, understand
the sources of and challenges to expertise in context. This is
crucial to guide the skilful, as opposed to clumsy, use of
technological possibilities.
Corollary VIIIA. There is no Neutral in Design
In design, we either support or hobble people’s natural
ability to express forms of expertise (Woods 2002).
The theme that leaps out from past results is that failure
represents breakdowns in adaptations directed at coping with
complexity. Success relates to organisations, groups and
individuals who are skilful at recognising the need to adapt
in a changing, variable world and in developing ways to
adapt plans to meet these changing conditions despite the
risk of negative side effects.
Recovery before negative consequences occur, adapting
plans to handle variations and surprise, and recognising side
effects of change are all critical to high resilience in human
and organisational performance. Yet, all of these processes
depend fundamentally on the ability to see the emerging
effects of decisions, actions, policies – feedback, especially
feedback about the future. In general, increasing complex-
ity can be balanced with improved feedback. Improving
feedback is a critical investment area for improving human
performance and guarding against paths toward failure. The
constructive response to issues on safety is to study where
and how to invest in better feedback.
This is a complicated subject since better feedback is:
• integrated to capture relationships and patterns, not
simply a large set of available data elements;
• event based to capture change and sequence, not simply
the current values on each data channel;
• future oriented to help people assess what could happen
next, not simply what has happened;
• context sensitive and tuned to the interests and
expectations of the monitor.
Feedback at all levels of the organisation is critical because
the basic pattern in complex systems is a drift toward failure
as planned defences erode in the face of production pressures
and change. The feedback is needed to support adaptation
and learning processes. Ironically, feedback must be tuned
to the future to detect the emergence of the drift toward
failure pattern, to explore and compensate for negative side
effects of change, and to monitor the changing landscape of
potential paths to failure. To achieve this organisations
need to develop and support mechanisms that create foresight
about the changing shape of risks, before anyone is injured.
Barley S, Orr J (eds) (1997). Between craft and science: technical work in
US settings. IRL Press, Ithaca, NY.
Cook RI (1999). Two years Before the Mast: Learning How to Learn about
Patient Safety. In W. Hendee, (ed.), Enhancing Patient Safety and
Reducing Errors in Health Care. National Patient Safety Foundation,
Chicago IL.
Cook RI, Woods DD, Miller C. A tale of two stories: contrasting views on
patient safety. National Patient Safety Foundation, Chicago IL, April
1998 (available at
Cook RI, Render M, Woods DD (2000). Gaps: learning how practitioners
create safety. British Medical Journal 320:791–794.
Feltovich P, Ford K, Hoffman R (eds) (1997). Expertise in context. MIT
Press, Cambridge MA.
Hendee W (ed) (1999). Enhancing patient safety and reducing errors in
health care. National Patient Safety Foundation, Chicago, IL.
Hirschhorn L (1997). Quoted in Cook RI, Woods DD and Miller C
(1998). A Tale of Two Stories: Contrasting Views on Patient Safety.
National Patient Safety Foundation, Chicago IL, April 1998.
Hollnagel E (1993). Human reliability analysis: context and control.
Academic Press, London.
Hollnagel E (1998). Cognitive reliability method and error analysis
method. Elsevier, New York.
Hutchins E (1995). Cognition in the wild. MIT Press, Cambridge,
James W (1890). Principles of psychology. H. Holt & Co. NY.
Klein G (1998). Sources of power: how people make decisions. MIT Press,
Cambridge, MA.
Klein GA, Orasanu J, Calderwood R (eds) (1993). Decision making in
action: models and methods. Ablex, Norwood, NJ.
Perrow C (1984). Normal accidents. Basic Books, NY.
Rasmussen J (1985). Trends in human reliability analysis. Ergonomics
Rasmussen J (1986). Information processing and human–machine
interaction: an approach to cognitive engineering. North-Holland,
New York.
Rasmussen J, Pejtersen AM, Goodstein LP. At the periphery of effective
coupling: human error. In Cognitive systems engineering. Wiley, New
York, pp 135–159.
Rochlin GI (1999). Safe operation as a social construct. Ergonomics
Schon DA (1995). Causality and causal inference in the study of
organizations. In Goodman RF, Fisher WR (eds). Rethinking knowl-
edge: reflections across the disciplines. State University of New York
Press, Albany, pp 000–000.
Weick KE, Roberts KH (1993). Collective mind and organizational
reliability: the case of flight operations on an aircraft carrier deck.
Administration Science Quarterly 38:357–381.
Weick KE, Sutcliffe KM, Obstfeld D (1999). Organizing for high
reliability: processes of collective mindfulness. Research in Organiza-
tional Behavior 21:81–123.
Woods DD (2000). Behind human error: human factors research to
improve patient safety. In National summit on medical errors and
patient safety research, Quality Interagency Coordination Task Force
and Agency for Healthcare Research and Quality, 11 September
Woods DD (2002). Steering the Reverberations of Technology Change on
Fields of Practice: Laws that Govern Cognitive Work. In Proceedings of
the 24th Annual Meeting of the Cognitive Science Society, August
2002. [Plenary Address].
Woods DD, Johannesen L, Cook RI, Sarter N (1994). Behind human error:
cognitive systems, computers and hindsight. Crew Systems Ergonomic
Information and Analysis Center, WPAFB, Dayton OH,1994(at http://
Woods DD (1988). Coping with complexity: the psychology of human
behavior in complex systems. In Goodstein LP, Andersen HB, Olsen SE
Nine Steps to Move Forward from Error 143
(eds). Mental models, tasks and errors. Taylor & Francis, London, pp
Woods DD, Patterson ES (2002). How unexpected events produce an
escalation of cognitive and coordinative demands. In Hancock PA,
Desmond P (eds). Stress workload and fatigue. Erlbaum L, Hillsdale, NJ
(in press).
Woods DD, Tinapple D (1999).W
: watching human factors watch people
at work. Presidential address, 43rd annual meeting of the Human Factors
and Ergonomics Society, 28 September 1999 (multimedia production at
Correspondence and offprint requests to: D. D. Woods, Cognitive Systems
Engineering Laboratory, Department of Industrial and Systems Engineering,
Ohio State University, 1971 Neil Avenue, Columbus, OH 43210,
USA. Email: