INSTITUTE OF PHYSICS PUBLISHING PHYSIOLOGICAL MEASUREMENT
Physiol. Meas. 23 (2002) R111–R132 PII: S0967-3334(02)00753-0
TOPICAL REVIEW
Human factors error and patient monitoring
T Walsh and P C W Beatty
Division of Imaging Science and Biomedical Engineering, The University of Manchester,
The Stopford Building, Oxford Road, Manchester M13 9PL, UK
Received 10 July 2001, in final form 2 January 2002
Published 9 May 2002
Online at stacks.iop.org/PM/23/R111
Abstract
A wide range of studies have shown that human factors errors are the
major cause of critical incidents that threaten patient safety in the medical
environments where patient monitoring takes place, contributing to
approximately 87% of all such incidents. Studies have also shown that good
cognitively ergonomic design of monitoring equipment for use in these
environments should reduce the human factors errors associated with the
information they provide. The purpose of this review is to consider the current
state of knowledge concerning human factors engineering in its application
to patient monitoring. It considers the prevalence of human factors error,
principles of good human factors design, the effect of specific design features
and the problem of the measurement of the effectiveness of designs in reducing
human factors error.
The conclusion of the review is that whilst the focus of human factors
studies has, in recent years, moved from instrument design to organizational
issues, patient monitor designers still have an important contribution to make
to improving the safety of the monitored patient. Further, whilst better
psychological understanding of the causes of human factors errors will in
future guide better human factors engineering, in this area there are still many
practical avenues of research that need exploring from the current base of
understanding.
Keywords: displays, alarms, human factors, human error, physiological
monitoring
1. Introduction
The operating theatre (OT), recovery room (RR) and all types of intensive care units (ICUs)
constitute what psychologists have termed cognitively complex environments. Other similar
environments include the cockpits of aircraft, control rooms of nuclear power plants, bridges
of ocean-going liners and any other environments where the number of pieces of information
required by an operator to make a correct decision can exceed the five that can be held in
conscious working memory simultaneously. Apart from the tendency for such environments
to be associated with safety critical activities, triggering of critical incidents in all these
environments is dominated by human factors errors, which are incorrect actions performed by
the operators.
In the hospital environments quoted above, the instruments with which patient monitoring
is carried out have changed dramatically over the last twenty years. For instance in anaesthesia
the number of displays, alarms or waveforms on a top-of-the-range monitor has risen from
approximately four in 1970 to 23 in 2000 (Beatty 2000). All these features compete for
attention along with clinical signs from the patient. Thus the patient monitors themselves have
become a significant contributor to the overall cognitive load on staff.
Over the same period, psychologists and engineers dealing with human factors error
in other safety critical environments started to devise methods of designing systems and
equipment in ways that attempted to minimize human factors errors. The methods employed
have relied heavily on the insights of psychology into how human factors errors arise. This
discipline has become known as human factors engineering, often shortened to human factors.
Human factors engineering encompasses many different disciplines. It is a process of
design and arrangement that leads to safe and effective use of equipment. In general terms it
can be thought of as the process of utilizing design to best effect, and is often synonymous
with the term ergonomics. Using design to facilitate or support tasks both mental and physical,
whilst taking into account aspects of the user and the working environment, should result in
more favourable outcomes. When applied to patient monitoring, the hypothesis would be
that by using the principles of good human factors engineering, the potential for human error
should be reduced, with a concomitant reduction in patient morbidity and mortality.
This is not to say that by designing ergonomically and implementing safer equipment
human error will be eliminated. In the field of anaesthesia, Botney and Gaba (1994) classified
the process of monitoring into three steps: obtaining information, interpreting information
and responding to information. These steps all take place within a constant re-evaluation of
the situation. As a result, in most cases, monitoring is still largely dependent upon the user,
and there are a number of ways in which user intervention can cause a system to fail. A high
workload, particularly in times of crisis, may mean that the user does not have the capacity to
deal with all the information presented to them. Inadequate training may mean that they lack
the skills or knowledge. Information processing may go awry resulting in slips or mistakes.
Also, whilst systems can be designed to be as safe and effective as possible, they are of little
use if improperly implemented or used. Monitors should only be used for the task for which
they were designed, and by individuals with appropriate training. Finally, equipment rarely
functions properly all the time, so pre-operative checks are of paramount importance and
should never be overlooked.
The purpose of this review is to consider the current state of knowledge concerning human
factors engineering in its application to patient monitoring. It will consider the prevalence
of human factors error in environments where patient monitoring takes place, the principles
of good human factors design, the design factors in instruments that have a bearing on their
human factors performance and the problem of the measurement of the effectiveness of design
in reducing human factors error.
2. The prevalence, nature and effects of human factors error in patient monitoring
The prevalence of human factors error has been established through retrospective or
prospective research, through clinical audit, and through voluntary reporting schemes. The
reporting of critical incidents and their determinants has been used as an aid in assessing the
extent and nature of human factors error in the OT, and also to a lesser extent in the ICU. The
sheer number of incidents reported illustrates how pervasive the problem of human error is.
One of the earliest, and perhaps the most frequently cited, studies of the extent and nature
of human factors error in anaesthesia was carried out by Cooper et al (1978). In their retrospective interview study, over 80% of the 359 preventable adverse incidents recorded involved human error. Of these, 19% of incidents were due to errors
primarily involving interaction with the anaesthesia machine, and 4.5% were monitoring
errors. Equipment failure was found to be responsible for only 14% of incidents, of which
a quarter were monitor failures. Caplan et al (1997) analysed the incidence of adverse
anaesthesia outcomes originating from gas delivery equipment, based on data collected in
the American Society of Anaesthesiologists (ASA) Closed Claims Project, for occurrences
between 1962 and 1991. Claims involving equipment misuse, defined in the study as human
fault or error, were three times more common than pure equipment failure. The study
concluded that the use, or better use, of monitoring could have prevented injury in 78%
of claims.
Though the literature on human factors error in the ICU is less well developed than that
in anaesthesia, similar trends are seen where studies have been performed. For example, in a
twelve month study of preventable mishaps by Wright et al (1991), 80% of events were felt to
have been due to human error, with inexperience with equipment and shortage of trained staff
contributing to such incidents.
Startling though these figures are, one must bear in mind that retrospective studies are
inherently problematic, particularly those involving the analysis of data collected over a lengthy
period of time. Firstly, due to its very nature, retrospective analysis is not necessarily a fully
accurate representation of events that have occurred. The incidents reported are likely to be
those of a more salient nature. As Webb et al (1993) argue, “It is highly likely that unusual,
interesting or particularly dangerous incidents (e.g. air embolism, anaphylaxis) are more likely
to have been reported than mundane events (e.g. circuit disconnection that was immediately
detected because the low pressure alarm sounded)”. Research of overly long study duration is
also biased by changes in technique, procedures, equipment and the introduction of national
and international standards.
A number of other biases have been implicated in critical incident reporting. Aside from
national audits such as the Confidential Enquiry into Peri-Operative Deaths (CEPOD) in the
UK, critical incident reporting remains largely a voluntary activity except where individual
hospitals insist it is done. As a result, there may be questions regarding the utility of the
reporting process, or even a fear of recrimination. Personnel operate under an ever-increasing
workload with many other demands on their time, and whilst they may appreciate its importance, voluntary reporting may be far down a long list of tasks. There is a distinct
lack of formal definition as to what constitutes a critical event, making comparison studies
difficult. In an attempt to provide some coherency in the reporting of critical incidents, Banks
and Tackley (1994) put forward a set of critical incident terms for use in reporting. They
argued that it would be useful, if not essential, to have an agreed set of terms to describe the
incidents. A related problem is that critical incidents rarely occur in isolation, and there may
be some difficulty in deciding which event to record as being the primary incident. As Currie
(1989) states, situations “evolve over a period of time to produce a cascade of events, with the
primary events or cause at the beginning and the outcome at the end. At some point in this
cascade, an untoward occurrence is noted, and thereafter a system of responses initiated, some
or all of which may be appropriate. Because of this, there may be widely differing views as to
which particular event constitutes the ‘critical incident’ itself”. Additionally, no matter how
informative research findings from single sites are, they are unlikely to be representative of the
hospital population as a whole. In particular, research tends to focus upon teaching hospitals,
which are not necessarily representative of the general standard of practice.
The impact of salience on whether an event is reported has already been mentioned.
When events are reported, the reported incident may deviate markedly from the actual
event for a number of reasons. In a study of manual recording of peri-operative events
by anaesthetists, Rowe et al (1992) found major inaccuracies in the text entries of compiled
anaesthetic records. The most widespread problem was that of data omission. In addition,
the lack of a formal definition of terminology leads to a lack of clear criteria for entry.
Thrush (1992) analysed data resulting from consecutive automated and manual recordings
of an anaesthetic procedure when alarms had been disabled. The results from the two recording
mechanisms were quite different. For instance the lowest automatically recorded blood oxygen
saturation (SpO2) was 35%, whereas the manual record indicated a minimum SpO2 of 76%.
For a true indication of the incidence and prevalence of adverse events it was recommended
that automated monitoring be used. Although based on a single case study, such evidence
illustrates an instance whereby an adverse event may have gone unnoticed and unrecorded
had it not been for the automatic recording of information, and also the potential dangers of
deactivating alarms. It seems very likely that the actual number of incidents is significantly greater than the number reported.
Whilst many studies have attempted to document the nature and extent of critical incidents
and the role of human error, perhaps the most informative and methodologically superior is
the Australian incident monitoring study (AIMS). This is a prospective voluntary survey that
used a structured incident reporting technique in ninety hospitals and practices. Williamson
et al (1993) found that 79% of all incidents reported to the AIMS involved human
error. Other, single site, prospective studies of relatively short duration (less than three years)
have found comparable, if slightly higher, figures (e.g. Short et al (1992); Currie (1989)).
The nature of human error involved in the critical incidents reported was varied. Error of
judgement, faults within the technique, failure to check equipment and inexperience were
frequently cited. These categories of human error were not necessarily mutually exclusive.
More than one type of error contributed to individual incidents.
Similar results have been found through simulation studies. DeAnda and Gaba (1990), in a study which focused on junior anaesthetists, found that human factors error accounted for 65.9% of critical incidents and fixation error accounted for a further 20.5%. Table 1
shows the types of spontaneous human factors error discovered during this study. Fixation
error is a type of human factors error where the operator becomes inappropriately locked to
one aspect of the scene to the exclusion of other aspects that may be more important. Of the
fixation errors, half were concerned with fixation on monitors. Table 2 shows a breakdown of
incidents by cause of error. Many of these errors are concerned with inexperience and raise
issues of training since this study was of trainee anaesthetists during simulation for training
purposes. Finally, table 3 shows a breakdown of the same errors by the phase of anaesthesia being simulated when the error occurred. Whilst induction and recovery are recognised as the most dangerous periods in anaesthesia, the greater number of incidents in this study occurred during the easier maintenance phase, where monitoring plays its most important role. The
nature of the errors is also informative. In the study, in descending order of frequency, the
errors reported were: failure to check, inexperience, inattention, fixation, haste, distraction,
fatigue and failure to follow procedure. Studies such as CEPOD have shown similar trends.
In the study of more than half a million operations, it was found that anaesthesia contributed
to 14% of all deaths, and that in almost one fifth of these deaths, avoidable errors occurred
(CEPOD 1998). It is worth noting that the rate of human factors error found in these and
Table 1. Types of spontaneous human factors error discovered during anaesthesia simulation.

Cause of critical incident       (%)a
Human factors error              65.9
Fixation error                   20.5
Unknown cause                    10.6
Equipment failure                 3.0

Table 2. Breakdown of incidents by cause of error.

Type of error                    (%)a
Failure to check                 43.9
Inexperience                     41.0
Inattention                      32.7
Fixation                         20.5
Haste                            25.8
Distraction                      14.0
Fatigue                          10.8
Not following procedure           6.1

Table 3. Breakdown of the same errors by the phase of anaesthesia being simulated when the error occurred.

When during anaesthesia?         (%)a
Pre-induction                     4.0
During induction                 26.0
Beginning of procedure           17.0
Middle of procedure              41.0
End of procedure                  9.0
After procedure                   3.0

a More than one error can apply to one critical incident, so the percentage total can exceed 100%
(DeAnda and Gaba 1990).
other studies is consistent with the rates found for other complex environments, e.g. aviation
(Wiegmann and Shappell 2001). Further, whilst the number of critical incidents may have
fallen with better training over the past ten years or so, there is no evidence from studies in
any country, including the UK, that the rate of human error as causative factors within critical
incidents has fallen significantly. As has been pointed out by Johnson (1999), this may well
be because of the failure of human factors research to have a systematic impact on practice in
safety critical situations, including medicine.
It is clear that human factors errors are the most prevalent cause of critical incidents
in areas where patient monitoring takes place. The design of equipment, and sufficient and
appropriate training are important considerations if the incidence of human factors error is to
be reduced.
Many aspects, such as the characteristics of the patient, the length of time before the error is recognized and the time elapsed before corrective action is taken, influence the effects of human factors error. Outcome may be either transient or permanent, resulting in morbidity or
mortality. Medical personnel involved in the critical incident will also be affected, not least
by their involvement in the resultant enquiry.
3. A psychological framework for describing human error
Allnutt (1987) has argued that all human beings make errors and that these are a completely
normal part of human cognitive function. Even under ideal conditions, performance on most
complicated tasks is rarely perfect (Wickens 1984). In most cases the error is resolved with
no lasting effect. However, in a safety critical, cognitively complex environment the potential
for error is increased, and the potential effects of such error are life threatening. Psychological
frameworks to describe human error have drawn on studies in several of these cognitively
complex environments. The general approach has moved away from merely quantifying errors to attempting to identify their nature and cause. To this end, Rasmussen (Rasmussen et al 1981, 1992) and Reason (1987, 1990) have incorporated an information processing aspect in
a largely taxonomic approach to the study of human error.
The contributions of Rasmussen and Reason complement one another in forming
psychological frameworks most relevant to patient monitoring. Rasmussen discriminated
between three levels of human behaviour, each increasing in cognitive complexity. Further,
he sought to distinguish between causes, mechanisms and modes of human error. His three
levels were skill-based, rule-based and knowledge-based behaviours. Skill-based behaviours
involve the use of stored patterns of pre-programmed behaviour. There is little conscious effort involved on the part of the operator, who reacts in a largely automatic manner; thus skill-based behaviours can be considered analogous to autonomic actions in the nervous system.
Behaviour at the skill-based level is primarily a way of dealing with routine and non-
problematic activities in familiar situations. Rule-based behaviour involves performance in
familiar settings using stored or readily available rules derived from experience or training. The
most cognitively intensive form of behaviour is knowledge-based behaviour, which is usually
brought into play when novel, unfamiliar or unplanned events unfold. This is event-specific
behaviour based on the operator’s knowledge and understanding of the system. Knowledge-
based behaviour requires higher level cognitive processes such as problem solving, goal
selection and actions which must be planned at that point in time.
The error process can typically be defined as a chain of events beginning with the
occurrence of an event in the environment, and resulting in an observable or unobservable error.
The initial event, the cause of human malfunction, causes the release of a psychological failure
mechanism. A resulting malfunction in human behaviour is then invoked that may or may not
manifest itself in the operating system as an observable error, the external mode of malfunction.
There are many factors that determine whether an error mechanism results in a malfunctioning
response, such as performance shaping and situational factors. The error mechanisms invoked
differ according to the level of behaviour involved, for instance at the skill-based level, typical
error mechanisms can result in inadequacies in the control of movements. At the rule-based
level, typical errors are related to memory characteristics, the misclassification of situations
which leads to the application of the wrong rule, or the incorrect recall of procedures. At
the knowledge-based level, errors arise from resource limitations (bounded rationality) and
incomplete or incorrect knowledge.
In contrast to Rasmussen’s model, which considers the context in which the error
occurred, Reason (1987) has developed a context-free model of human error involving the
aforementioned levels of cognitive control. This model, the generic error modelling system
(GEMS), is shown schematically in figure 1.

Figure 1. The generic error modelling system (GEMS) (Reason 1987). (The figure traces the operator's response to a detected error state: routine corrective action at the skill-based level, where the error types are slips and lapses; a search for a familiar pattern and an applicable rule at the rule-based level, where the error types are rule-based mistakes; and, when no rule resolves the problem, reversion to higher mental functions to infer a diagnosis, strategy and action at the knowledge-based level, where the error types are knowledge-based mistakes.)

GEMS assumes that an error state has been identified by the operator from their observations of the situation, e.g. an alarm on the SpO2 monitor may have gone off. Through their actions the operator then tries to return the system to a stable state. This is the desired goal state in the figure, i.e. no alarm with satisfactory SpO2. The operator will first apply a simple action at the skill-based level, without any real
thought, to try to resolve the matter. So in the example, the operator might see the SpO2
connector had come loose and will tighten it. If the simple actions taken do not resolve the
error state the operator goes to the rule-based level, in which they consider the general state
of the situation, so in our example they might look at all the monitor readings. If the pattern
is familiar they apply a series of actions that have been shown in the past to resolve the situation.
This series of actions constitutes the rule to be applied. For instance they might notice that
the inspired oxygen level has dropped and that the inspiratory pressure is zero. This is the
pattern of a breathing system disconnection, so the rule to apply is to check the breathing system function. Rule-based behaviours are fast and, if correctly selected, very effective at resolving
error states. The final stage is knowledge-based behaviour, which requires higher cognitive effort.
The operator then has to work out from first principles what is wrong. GEMS suggests that
most human factors errors arise because of the selection of the wrong rule by the operator or
the failure to realise the rule is inappropriate or to realise that they do not have the required
knowledge to synthesise a solution. Reason (1987) hypothesises that humans try to stay in
the rule-based behaviour to reduce their cognitive workload or strain. Thus human error is the
direct result of a shortage of conscious working memory. Though partly model-based, GEMS
is largely taxonomic in nature, and yields three basic error types—skill-based slips or lapses,
rule-based mistakes and knowledge-based mistakes. As with Rasmussen, central to Reason’s
theory is the distinction between the three levels of performance. Further, the GEMS classifies
errors into two categories: slips or lapses, and mistakes. Slips and lapses can be considered as
the failure of an action to proceed as intended, and occur when actions are performed without
conscious thought. A slip is an error where the wrong action is selected directly, whereas a
lapse is an error where the wrong action is selected at a mental level. Conversely, a mistake is
the failure of an intended action to achieve the desired consequence. Mistakes are technical or
judgmental errors usually due to inadequate information, inappropriate training or experience,
or insufficient supervision or support. In this context a slip or lapse is considered unintentional,
whereas a mistake is generally thought of as an actual error of judgement or intention. Weinger
and Englund (1990) state that slips are more likely to occur during activities for which one is
highly trained, and as a result, experts are more likely to make slips than novices. The design
of the system for the task in hand is paramount, for an individual is more likely to commit an
error when they are mismatched to the task, or when the system is user unfriendly.
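The decision flow that GEMS describes can be paraphrased in a few lines of code. The sketch below is only an informal illustration of the levels shown in figure 1, using the SpO2 scenario discussed above; the function name, the rule table and the monitor readings are all hypothetical and are not drawn from Reason's paper.

# Illustrative sketch of the GEMS decision flow (after Reason 1987).
# All names, rules and the example readings are hypothetical.

def gems_respond(readings, resolved, rules):
    """readings: dict of monitor values; resolved: predicate saying whether an
    action clears the error state; rules: map from recognised pattern to action."""
    # Skill-based level: a routine, near-automatic action (slips and lapses occur
    # here), e.g. re-seating a loose SpO2 probe connector.
    if resolved("re-seat SpO2 probe"):
        return "re-seat SpO2 probe", "skill-based"

    # Rule-based level: look for a familiar pattern in the local state and apply a
    # stored rule (rule-based mistakes correspond to choosing a wrong or bad rule).
    if readings.get("FiO2", 1.0) < 0.21 and readings.get("Paw", 1.0) == 0.0:
        action = rules.get("breathing system disconnection")
        if action and resolved(action):
            return action, "rule-based"

    # Knowledge-based level: no stored rule fits, so reason from first principles
    # (knowledge-based mistakes arise from incomplete or incorrect knowledge).
    return "diagnose from first principles", "knowledge-based"

# Hypothetical usage: low inspired oxygen with zero airway pressure suggests a
# breathing system disconnection, so the rule-based level handles it.
rules = {"breathing system disconnection": "check breathing system"}
readings = {"SpO2": 0.85, "FiO2": 0.10, "Paw": 0.0}
print(gems_respond(readings, resolved=lambda a: a == "check breathing system", rules=rules))

Run with these illustrative readings, the sketch returns a rule-based response ('check breathing system'); in GEMS terms, a human factors error would correspond to that rule being the wrong one, or to the operator failing to move to the knowledge-based branch when no stored rule fits.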
GEMS postulates that slip-type errors occur most frequently at the skill-based level,
whereas mistake-type errors usually occur at the higher rule and knowledge-based levels. In
accordance with this theory, errors preceding detection (skill-based slips and lapses) are mainly
associated with monitoring failures like inattention and over-attention. Both are serious and
have the potential to be life threatening. Inattention involves failure to check behaviour at a
critical point beyond which routine actions branch towards a number of possible outcomes.
Over-attention occurs when a high-level inquiry is made as to the progress of an ongoing
action and an incorrect assessment is made. The current position is assessed as either being
further along or further back in the sequence of events than it actually is. It is only once an
individual has become aware that a problem has occurred that either rule-based or knowledge-
based performance is brought into play. Skill-based slips generally precede the detection of a
problem, while rule-based and knowledge-based mistakes arise during subsequent attempts to
find a solution to the initial problem. Thus, a defining condition for both rule- and knowledge-
based mistakes is an awareness that a problem exists. Errors following detection (rule-
and knowledge-based mistakes) are subsumed under the general heading of problem solving
failures. Examples of problem solving failures at the rule-based level are the misapplication
of good rules or the application of bad rules. Examples of problem solving failures at the
knowledge-based level include selectivity, distraction and overconfidence. The key feature of
the GEMS is the assertion that when confronted with a problem, human beings are strongly
biased towards searching for and finding a pre-packaged solution at the rule-based level before resorting
to the far more effortful knowledge-based level. This occurs even when the latter is demanded
at the outset.
Using this framework, Reason proffers an explanation as to why human factors error
occurs. Rules are applied in order to avoid the cognitive cost associated with ascending to the
knowledge-based level. The operator fails to recognise the need to proceed from the rule-based
level to the knowledge-based level. The operator effectively becomes stuck at the rule-based
level, applying inappropriate rules, and human factors error results. The majority of critical incidents involving monitoring equipment detailed in the literature that arise from human error are a result of this problem. If we are to reduce these human factors errors we need
to design decision support systems that nudge the operator out of any state likely to get them
stuck at the rule-based level. The models of Rasmussen and Reason have recently been used in
the identification and analysis of critical incidents in an intensive care unit (Busse and Johnson
1999).
In describing the psychological framework of human error in this review, focus has been
placed upon understanding the processes within an individual that result in error. A second
approach, termed the system or organizational approach, concentrates on how the elements
of the overall system including the individuals, interact to allow errors to become incidents
(Reason 2000). The central tenet of this approach follows from the view of Allnutt that all
humans err and that even the most intelligent, experienced and respected amongst us will
commit acts of error. In any organisation, there are usually a series of defences or barriers,
organizational, mental or physical, which must be penetrated in order for error to occur. Good
human factors design of monitoring equipment can be thought of as one such barrier, usually
one that is close to the patient affecting pre-operative or intra-operative decisions. Thus the
human factors properties of the monitoring equipment are affected mostly by active failures
well described by the GEMS model. However, the description adds an extra class of failures.
These are latent failures, which occur further away from the patient. These latent failures
afflict such things as healthcare organisation, hospital management and teamwork within the
OT or surgical department.
Reason illustrates how incidents or system accidents occur in this sort of environment
by means of a Swiss cheese model. The layers are considered as containing ever-changing
holes, which expand, contract and relocate according to circumstance and time. Holes in
these layers arise due to active failures, such as the slips, lapses and mistakes described above
associated with front-line staff, and latent conditions generated by higher-level staff such as
managers and designers. When a single layer is penetrated, the consequences are normally
minor. However, if the holes line up and the entire series of layers is penetrated, then serious
outcomes occur. This description by Reason has become extremely influential in several
areas of human factors engineering including aviation safety (Wiegmann and Shappell 2001)
and surgery (Carthey et al 2001). In the use of monitoring and other medical devices it has
informed the current Medical Devices Agency advice contained in their 2000 report Equipped
to Care (Medical Devices Agency 2000). The original Swiss cheese model has been developed further by Zotov (1996) and O'Hare (2000) in the aviation context and by Taylor-Adams et al (1999)
in medicine.
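The force of the Swiss cheese picture can be shown with a short worked calculation. The barrier names and failure probabilities below are entirely hypothetical, and the layers are assumed to fail independently, an assumption that latent conditions in real organisations often undermine.

# Hypothetical illustration of the Swiss cheese model: an incident reaches the
# patient only if every defensive layer is breached at the same time.
p_breach = {
    "organizational policy": 0.05,   # latent condition, e.g. poor procurement
    "team cross-checking":   0.10,   # latent/active, e.g. no second check
    "monitor alarm design":  0.10,   # the barrier closest to the patient
    "operator response":     0.20,   # active failure (slip, lapse or mistake)
}

p_incident = 1.0
for layer, p in p_breach.items():
    p_incident *= p                  # independence assumed for illustration only

print(f"P(all layers breached) = {p_incident:.6f}")   # 0.000100 with these numbers

The point of the model is that strengthening any single layer, including the human factors design of the monitor itself, reduces the overall product even if the operator's own error rate is unchanged.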
The only class of direct actions that affect safety not accounted for in the above
models are violations. Violations are actions taken by an individual that contravene
codes of best practice, guidelines, social norms or the law. Three types of violations
are identified in the literature on the basis of etiology (Wiegmann and Shappell 2001).
Routine violations tend to be habitual by nature and are often enabled by a system of
supervision or management that tolerates them. Automatically turning off alarms on
monitoring equipment could well be an illustration of a routine violation. Exceptional
violations are isolated actions neither condoned by the culture of the environment nor typical
of the individual. The last type of violation is the malevolent action, where harm to the patient is deliberately intended. Very little research has been conducted into violations
concerning environments where patient monitoring takes place. The notable exception is the
findings of the MDA’s Expert Advisory Working Group on Alarms on Clinical Monitors, Use
and Practical Issues (The Advisory Working Group on Alarms on Clinical Monitors 1995)
convened in response to the Clothier Report (The Allit Inquiry) (Clothier 1994) concerning
the deliberate harming of children at the Grantham and Kesteven Hospitals. The MDA
report drew attention to the need to report the failure of any alarm on monitoring
equipment.
4. Designing ergonomically
Allnutt (1987) argued that “Systems and procedures designed around a thorough knowledge
of human cognition and attitudes have the potential to prevent, or ameliorate, some of the
errors which are a normal and necessary part of human functioning”. To take a common-sense
approach, the most human factors friendly system will be one that fits in with how an
individual processes information, rather than one that manipulates individuals to behave in a
way that is unnatural. Designing with the grain of the cognitive patterns of users in this way
has been advocated by Rasmussen (Rasmussen and Vicente 1987). This type of approach
has been termed ecological and has been used to support the design of display systems and
other forms of human–computer interfaces. In theory, ecologically designed systems should
cause fewer new human factors errors and address more existing types of human factors error
than non-ecologically devised systems.
In the late 1980s (revised 1993), the Association for the Advancement of Medical
Instrumentation published human factors guidelines for use by designers of medical devices
(AAMI 1988, 1993). Central to these documents were the following general design
recommendations:
• Consistency
• Simplicity of design
• Safety, including the elimination or minimization of the potential for human error during operation and maintenance under routine, non-routine and emergency conditions
• Environmental and organizational considerations
• Console and panel layout, including organizing principles, priority of control and display location, console arrangement of controls and displays, separation of controls, and control and display integration
• Adequate documentation
Areas where such design recommendations have been implemented are the acquisition
and provision of patient monitoring data and, to a lesser extent, intelligent alarms. Gilhooly
et al (1991) describe users’ perceptions of the ABICUS system, a computerized information system for use in the ICU. The ease with which information could be obtained or retrieved from this system was compared to the existing manual method of data handling. Users’ perceptions
of the system were also assessed. Attitudes towards the computerized system were favourable
and its use was continued. In addition to the assessment of user perceptions, improvement in
outcome should also be considered.
In addition to explicit design recommendations of this type, principles of good practice
prior to and following the introduction and implementation of new systems have also been
used to ensure that designs are more or less ecological. End users should be involved in
system design from the outset, making the design user centred. Once completed, there should
be iterative prototyping with substantive feedback from the end users, and adequate testing
to ensure suitability and appropriate modifications. Once a system is in place, user attitudes
should be elicited in order to determine acceptance and suitability for the task.
Recent studies have shown that attitudes towards computers in health care are generally
positive (e.g. Detmer and Friedman 1994), though concerns regarding privacy and the doctor–
patient relationship continue to be expressed. A survey into the perceptions of nurses of
bedside computers indicated that the system was well-received and used over 75% of the
time during the day shift to document vital signs and measurements and intake and output
quantities (Willson 1994). Factors leading to a negative effect on the acceptability of a
system can be attributed to flaws in the design process. These include a failure to incorporate
‘an adequate knowledge of the cognitions and working practices of the eventual users’ in
the design of a system (Logie et al 1997), and the employment of additional staff, usually
computer personnel, to oversee the running of the system (Alberdi and Logie 1997). These
aspects violate basic principles of good design, and involve an additional financial burden in
terms of system refinement and increased payroll costs. For the medical staff involved in
using a new system, perhaps the most frustrating drawback is the recording of parallel records
in order that comparative reliability can be established. This only serves to increase rather
than decrease the workload.
It is also important that any new system is thoroughly tested in the appropriate arena
prior to full implementation. As already mentioned, the focus of human factors research has
traditionally centred upon nuclear power plants and airplane cockpits, though the principles
are slowly filtering through to medicine. All are dynamic, complex environments involving
high workloads and vitally important monitoring processes. Watt et al (1993) detail the
analogy between the practice of anaesthesia and other high technology environments, and
also illustrate the important differences. Whilst there are similarities such as long periods
of time where activity is minimal but vigilance must be maintained, substantial differences
cannot be overlooked. Unlike the systems of airplane cockpits and nuclear power plants,
the human body and its practices are still not fully understood. Also, whilst the principles of
research, such as alarm prioritization, have been successfully imported from the field of aviation
to that of anaesthesia (e.g. the perceived urgency work of Haas and Edworthy (1996)) it is
important to remember that each environment has its own complexities and idiosyncrasies.
Citing the example of intelligent alarms, Watt et al (1993) argue that many of the design
criteria for intelligent patient alarms will appear impressive on paper but should be thoroughly
investigated in the clinical setting.
The relatively recent emergence of cognitive ergonomics has led to further changes in
emphasis. Traditional ergonomics focused on improving the design and layout of hardware
by making equipment physically easier to use—cognitive ergonomics focuses on coping
with varying workloads and maintaining vigilance and attention in both demanding and
undemanding periods of work. Whilst classical ergonomics concentrates on the physical
aspects of work and human capabilities such as force, posture and repetition, cognitive
engineering focuses on the outcome or product that results from the efforts of the work
system as well as the work in itself (Hollnagel 1997). Based on the results of previous
research (e.g. Wallace et al 1994, Westenskow et al 1992), Eisenkraft (1997) postulates the following requirements for electronic monitoring: it should be “user-friendly, automatically
enabled when needed, have alarm threshold limits easily bracketed to ‘normal’ conditions, be
intelligent (‘smart alarms’), and the alarm signal emitted should be appropriate in terms of
urgency, specificity, and loudness, depending on error detected and on workplace conditions”.
5. Human factors design in patient monitors
5.1. The auditory versus the visual channel for communication
The channel of presentation of information, visual or auditory, will depend upon the nature and
urgency of information to be transmitted. The auditory channel is preferred when information
occurs randomly or requires the immediate capture of information (Botney and Gaba 1994).
For example the auditory channel will be suitable for the presentation of information from the
pulse oximeter or the heart rate from the ECG. However, it has yet to be established whether
listeners are reliably able to detect all information presented. This is particularly true for those
with hearing loss (Wallace et al 1994).
Alarms are an important component of any monitoring system, and in their simplest form are used to attract the user’s attention. As such, alarms can be either visual or auditory. Research
has concentrated upon problems with and improvements to auditory alarms, reflecting the
tendency in practice for attention attracting alarms to be auditory. There are two main
difficulties associated with visual alarms. In a conventional OT, the patient is in front of the
anaesthetist and the monitor displays and controls behind or to the side. As such, the time taken
to detect a visual alarm is significantly greater than that taken to detect an auditory alarm. In
a study designed to assess time taken by anaesthetists to respond to a simulated alarm, Morris
and Montano (1996) found that response times to visual alarms were significantly longer
than response times to audible alarms. They suggest that it is safer to rely on audible rather
than visual alarms when time-critical information such as oxygenation, heart rate change
and ventilator disconnection is concerned. A further disadvantage of visual alarms concerns
compliance rates. The rates for visual warnings have been found to be low, despite being
readily understood (e.g. Braun and Silver 1995, Braun et al 1995).
Commenting on the increasing complexity and data collection required of anaesthetists,
Frankenberger et al (1990) suggest the following objectives for information technology within
a patient monitoring system:
• Introduction of a hierarchy of alarms indicating the response required—emergency, cautionary or advisory. This would follow the urgency mapping concept developed by Edworthy et al (1991)
• A new structure involving pre-processing of signals, reduction of redundant information, and centralized display of the result. Information to be collected in an integrated Alarm and Information Management System
• A uniform alarm philosophy which facilitates user operation
• Minimization of artefacts
5.2. Auditory alarms
A general introduction to the issue of auditory alarms in the OT can be found in Hayes (1997).
It is clear from this review and others that currently auditory alarms suffer from a number
of shortcomings. In a summary of the work by Meredith and Edworthy (1995), Haas (1998)
details the critical problems with auditory alarms in the ITU and OR environment. There is
often an abundance of alarms leading to problems with identification. As Botney and Gaba
(1994) state, “Nearly every monitor and therapeutic device has a variety of alarms”. Further,
acoustically similar alarms make it difficult to differentiate one from another. If an alarm is
indistinguishable from another, or is not easily located, the practitioner will spend valuable
time in identifying the source. In an early study McIntyre and Stanford (1985) set out to
determine whether alarms that sounded the same could be identified through spatial auditory
discrimination. They found that in certain circumstances, if the alarm sounds were the same or
very similar, the anaesthetist “could not immediately and infallibly decide which alarm signal
was occurring”. Their findings, that in a typical working environment many anaesthetists
would fail to identify some auditory alarm signals, were also demonstrated later by Loeb
et al (1990). This inability to distinguish or recognise alarm sounds is a particular problem
for new staff.
Since they are designed to direct attention away from the task at hand, auditory alarms
may be considered to be unpleasant or irritating, often invoking negative reactions in the
clinical staff, particularly when they continue to warn of a situation that is currently being
dealt with. The response to the alarm may be for the clinical staff to disable the alarm rather
than check the condition of the patient (McIntyre and Stanford 1985). Khan and Loeb (1993)
found that alarms were disabled on physiologic and airway/gas monitors about 75% of the
time by anaesthetists. In addition, simplistic alarms that indicate only that there is a problem
fail to direct the user towards an appropriate response. As such auditory alarms are often
considered to be distracting rather than helpful.
Research has shown that a large percentage of alarms are likely to be spurious or false
alarms. Kestin et al (1988) found that, during the routine anaesthetic management of 50 surgical patients with no significant cardiac or respiratory disease, an alarm sounded on average every
four and a half minutes. Seventy-five percent of all alarms sounded were false alarms, while
only 3% indicated risk to the patient. Similarly, Wiklund et al (1994) determined the frequency
of true and false alarms from different monitoring devices in a post-anaesthesia setting. The
average frequency of pulse oximetry alarms was once every 8 min, with 75% of alarms being
false. False alarms were sounded as a result of sensor displacement, motion artefacts, poor
perfusion or a combination of these factors. Apnoea alarms occurred on average once every
37 min with 28% of these being false alarms. ECG monitoring was found to have a low alarm
rate, but a high proportion of false alarms. Reducing the sensitivity of the alarm in a bid
to decrease the number of false alarms only results in a reduction of the number of serious
situations that will be detected. Finally, the provision of single tone alarms means that there
is no relationship between the perceived urgency of the auditory alarm and the urgency of the
clinical situation. In a study by O’Carroll (1986), the origin and frequency of alarm soundings
in a general purpose ITU were recorded by nursing staff. Over the three-week period of study,
many false alarms were emitted by the monitoring system. Only eight out of a total of 1455
soundings indicated potentially threatening problems. It would make sense to suggest that
only a limited number of high-priority alarms should exist in anaesthetizing locations.
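These findings reflect a general statistical effect: when genuinely threatening states are rare, even a reasonably sensitive and specific alarm will produce mostly false positives. The short calculation below is purely illustrative; the prevalence, sensitivity and specificity figures are assumed for the example and are not taken from the studies cited above.

# Hypothetical illustration of why false alarms dominate when events are rare.
# Assumed figures: a genuinely threatening state is present during 1% of the
# monitored epochs, the alarm detects 95% of those (sensitivity) and stays
# quiet during 90% of benign epochs (specificity).
prevalence, sensitivity, specificity = 0.01, 0.95, 0.90

p_alarm = sensitivity * prevalence + (1 - specificity) * (1 - prevalence)
ppv = sensitivity * prevalence / p_alarm   # fraction of alarms that are true

print(f"Probability an epoch triggers an alarm: {p_alarm:.3f}")
print(f"Fraction of alarms that are genuine:   {ppv:.2%}")
# With these assumptions only about 9% of alarms are genuine; raising the
# threshold (lowering sensitivity) quietens the alarm but also misses more of
# the rare serious situations, exactly the trade-off described in the text.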
The significant problems associated with current auditory alarms mean that they are clearly
failing the medical personnel they are designed to assist. At best the alarms may alert the
anaesthetist to the occurrence of an untoward situation, but often fail to elicit the appropriate
reaction. False alarms or inappropriate performance lead to a rapid and catastrophic loss of
confidence in the system. Webb et al (1993) carried out a study of the monitors involved in
detecting the first 2000 incidents reported to AIMS. They found that “amongst the incidents
analysed, many involved problems which should have been detected early but progressed
because a monitor was not available, was available but not in use, or was in use but being used
incorrectly (e.g. with an alarm incorrectly set or turned off)”.
The two central requirements of an ergonomic non-verbal alarm are that it needs to be
heard, and that it is psychologically appropriate in order that learning time will be reduced
and a suitable response elicited (Edworthy 1994). For the alarm to be truly ergonomic, the
severity or nature of the problem should be indicated in some way, rather than the simplistic
‘there is a problem’ alert. A large volume of research has been devoted to alarm sounds, including studies of sound associations and the manipulation of sound parameters. An
important and influential systematic attempt to design sounds with such properties was the set of alarm sounds devised by Patterson (1982) and intended for use in aircraft. These alarm sounds
were later extended for use in medical equipment (Patterson et al 1986). Patterson’s system
was really a method of design rather than a series of sounds as such. It incorporated the idea
of change of pitch and repetition frequency to indicate progression of urgency. Block (1992)
examined the use of musical alarm tones with themes from popular songs for oxygenation,
ventilation, cardiovascular and temperature monitoring, artificial perfusion, and drug
administration systems, with a view to easier identification of alarm sounds. The study found
that familiarity with popular songs could help produce a set of alarm tones that would also be
familiar. Block has also proposed a set of variable pitch sounds that comply with the current
international standards for alarm sounds based on Patterson’s study (Block et al 2000).
Similarly, Stanton and Edworthy (1998) compared representative nomic, symbolic and
metaphoric sounds with traditional auditory warnings for devices in the ITU. They found
differences in recognition performance for ITU and non-ITU staff. ITU staff were better at
identifying equipment with traditional warnings, and non-ITU staff were significantly better
at identifying equipment with the new warnings. Ideally alarm sounds should evoke actions
appropriate to the situation they are warning about. The property of a device to elicit the
correct action is described as its affordance. The result of the Stanton and Edworthy study
suggests that familiarization with sounds is essential to prime staff to the appropriate action
associated with the sounds and hence optimise the sound’s affordance.
A simple form of affordance engineering is urgency mapping, whereby acoustic
parameters are manipulated in order to relate the perceived urgency of the alarm to the
urgency of the situation. It should be applied to alarms to provide an indication of the priority
of the clinical situation as it progresses. Thus the sound chosen for an alarm should map
appropriately the subjective urgency induced in the audience to the objective risk. This would
be invaluable in times of high workload, since it would allow the anaesthetist to prioritize
clinical situations, attend to the most urgent situations first, and temporarily ignore low
priority alarms until they can be dealt with. Momtahen and Tansley (1989) reported that there
is sometimes no relationship between the implicit urgency indicated by a sound’s parameters
and the degree of urgency the listener associates with the situation itself. Edworthy et al
(1991) have studied this concept primarily in the field of aviation. There is a growing body
of research dedicated to making alarms in medicine more ecologically valid through the use
of context and urgency mapping. However, applying urgency mapping to a clinical situation
is by no means straightforward, and entails two distinct phases. The first is to establish the
perceived urgency of the acoustic parameters. The second is to establish the perceived urgency
of the clinical situation. Whilst empirical evidence shows that some acoustic parameters contribute more strongly to perceived urgency than others, the construction
of a generally accepted and calibrated scale of severity for clinical situations has yet to be
performed. However, Edworthy et al (1991) have found that individuals are able to reliably
rank auditory warnings according to their urgency. The warnings differed in terms of pitch,
speed and harmonic content.
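A minimal sketch of urgency mapping is given below. It illustrates only the general idea that acoustic parameters known to raise perceived urgency, such as higher pitch, faster repetition and richer harmonic content, can be tied monotonically to a clinical priority level; the three-level priority scale, the parameter values and the names used are hypothetical and do not reproduce Patterson's or Edworthy's specifications.

# Hypothetical sketch of urgency mapping: clinical priority -> acoustic parameters.
# The values are illustrative only; real alarm design would follow the relevant
# standards and perceived-urgency calibration experiments.
from dataclasses import dataclass

@dataclass
class AlarmSound:
    fundamental_hz: float     # higher pitch tends to sound more urgent
    pulses_per_burst: int     # faster repetition tends to sound more urgent
    inter_pulse_ms: int       # shorter gaps tend to sound more urgent
    harmonics: int            # richer spectra tend to sound more urgent

URGENCY_MAP = {
    "advisory":  AlarmSound(fundamental_hz=440, pulses_per_burst=2, inter_pulse_ms=300, harmonics=2),
    "caution":   AlarmSound(fundamental_hz=550, pulses_per_burst=3, inter_pulse_ms=200, harmonics=4),
    "emergency": AlarmSound(fundamental_hz=660, pulses_per_burst=5, inter_pulse_ms=100, harmonics=6),
}

def sound_for(priority: str) -> AlarmSound:
    """Return the acoustic recipe mapped to a clinical priority level."""
    return URGENCY_MAP[priority]

print(sound_for("emergency"))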
Finley and Cohen (1991) studied the perceived auditory urgency of ten common alarms
and their corresponding clinical situations as determined by twelve senior anaesthetists. The
purpose of the study was to evaluate the perceived urgency of the auditory signal and its
correlation with the urgency of the corresponding clinical situation. The study can however be
criticized for the way in which the perceived urgency of the clinical situations was assessed.
A very small number of assessors were involved in the study. Twelve senior anaesthetists
were asked to rate the clinical situations. Contextual factors such as the characteristics of the
patient were also ignored.
For an auditory alarm to be of optimum use, the warning information provided by the
urgency mapping must be presented in such a way that even an inexperienced anaesthetist
with poor vigilance will be alerted. Additionally, context sensitivity could be incorporated
whereby alarm limits are set based on initial patient parameters. Many types of artefact that
are present in physiological signals are not filtered or identified in even the most advanced
system. Watt et al (1993) suggest that the monotonous observation of patient data can be
more efficiently carried out by intelligent technology such as rule-based algorithms, artificial
neural networks, expert systems and fuzzy logic designed to recognise pattern deviations.
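The context sensitivity suggested above can be sketched very simply: alarm limits are derived from a baseline recorded at the start of the case rather than from fixed population-wide defaults. The function below is a hypothetical illustration, and the plus or minus 20% band is an arbitrary choice for the example rather than a clinically validated one.

# Hypothetical sketch of context-sensitive alarm limits: bands are centred on the
# patient's own baseline rather than on fixed population-wide defaults.

def context_limits(baseline: dict, band: float = 0.20) -> dict:
    """Return (low, high) alarm limits set at +/- band around each baseline value."""
    return {name: (value * (1 - band), value * (1 + band))
            for name, value in baseline.items()}

def breaches(current: dict, limits: dict) -> list:
    """List the parameters whose current value lies outside its personal band."""
    return [name for name, value in current.items()
            if not (limits[name][0] <= value <= limits[name][1])]

# Illustrative values: baseline recorded at induction, current values later in the case.
baseline = {"heart_rate": 72, "mean_arterial_pressure": 90, "SpO2": 98}
limits = context_limits(baseline)
current = {"heart_rate": 95, "mean_arterial_pressure": 60, "SpO2": 97}
print(breaches(current, limits))   # ['heart_rate', 'mean_arterial_pressure'] with these values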
Constructing a suitable scale of severity is problematic not least due to the fact that
there is no objective measure of outcome. The risk to the patient must first be derived
from the expert opinion of the clinician, taking into account many factors, not least the characteristics of the patient.

Figure 2. Two types of polygon display. (Left) A generic polygon or radial display for cardiovascular parameters in the ITU. On the right is the Aberdeen Polygons (Green et al 1996) for the same set of measurements. These are colour-coded (target values green, danger values red, warning ranges white), which is hypothesized to increase the ability of users to detect significant changes in the vertices of the polygon. (Key: CI: cardiac index; SVRI: stroke volume resistance index; pO2: arterial partial pressure of oxygen; pCO2: arterial partial pressure of carbon dioxide; VO2: oxygen uptake; pH: blood acidity/alkalinity; RR: respiration rate; HR: heart rate; MAP: mean arterial pressure of blood; PCWP: pulmonary capillary wedge pressure.)

Research into the area of urgency mapping of auditory alarm
to severity of the clinical situation for the patient can only proceed once a satisfactory and
suitably calibrated hierarchy has been constructed. However, it is possible to evaluate patient
monitoring devices. In what has become a relatively dated piece of research, Myerson
et al (1986) evaluated 13 commercially available monitoring devices. Of this number, only
two devices were recommended for general use as ventilator alarms, and a further three as
disconnect alarms. In each instance the recommendations were subject to certain reservations
and suggested modifications.
5.3. Display designs for the presentation of information
There are two main methods of displaying data, alphanumeric and graphical, and the choice
of display is dependent upon the type of data involved. Alphanumeric displays are used for
quantitative data and data that are slowly changing. Graphic displays are best for qualitative
data, and data that change quickly. Graphics displays include waveforms, trend plots, figural
or object displays, and have the advantage of being able to illustrate relationships between
physiological variables.
Polygon displays (see figure 2) have been found to be a useful method of drawing attention
to an abnormal situation, but less helpful in facilitating identification of the parameter in
question. The difficulty associated with identifying exactly which vertices are abnormal
increases as the number of vertices increases. A trade-off must be made between the number
of vertices the polygon consists of and the time taken to identify the abnormal vertices.
As the number of vertices increases, the reaction time to identify the abnormal parameter
increases. Moreover, reaction times increase in line with the variability of the polygon
(Greaney and MacRae 1996). Measuring reaction time to identify whether a polygon was
regular or irregular, and the location of the abnormal parameter, Green et al (1996) found that
non-medically qualified participants could readily extract information from polygons
comprising eight or ten sides but with an advantage for simpler polygons and for information
displayed at the top of the diagram. Colour coding was found to aid in removing these biases
and also resulted in faster processing times.
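The geometry behind a polygon or radial display is straightforward to express: each parameter is normalized against its reference range and plotted as the radius of one vertex, so that a near-regular polygon means all parameters are near target and a distorted vertex draws the eye to the deviating variable. The sketch below is a hypothetical illustration of that mapping and is not a reconstruction of the Aberdeen Polygons software; the reference ranges are assumed for the example.

# Hypothetical sketch of a polygon (radial) display: each physiological parameter
# becomes one vertex whose radius is the value normalized to its reference range.
import math

def polygon_vertices(values: dict, ranges: dict) -> list:
    """Return (x, y) vertex coordinates; radius 1.0 means 'middle of the reference range'."""
    names = list(values)
    n = len(names)
    vertices = []
    for i, name in enumerate(names):
        low, high = ranges[name]
        radius = (values[name] - low) / (high - low) + 0.5   # 1.0 at mid-range
        angle = 2 * math.pi * i / n                          # evenly spaced vertices
        vertices.append((radius * math.cos(angle), radius * math.sin(angle)))
    return vertices

# Illustrative reference ranges and readings; a normal patient gives a near-regular polygon.
ranges = {"HR": (60, 100), "MAP": (70, 105), "RR": (12, 20), "SpO2": (94, 100)}
values = {"HR": 80, "MAP": 88, "RR": 16, "SpO2": 97}
for (x, y) in polygon_vertices(values, ranges):
    print(f"{x:6.2f} {y:6.2f}")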
Trend displays are recommended in the display of continually evolving information.
However, for trend displays to be beneficial the design should be suitably ergonomic, staff
must be suitably trained in their use, and able to distinguish between normal and abnormal states.
The interpretation of trend data can be a complex process, particularly to those who are new
to the method. An individual’s ability to detect an unanticipated signal while observing a
stream of data will be influenced by many factors, including the signal’s frequency, salience
and duration, as well as the state of the observer (Gurushanthaiah et al 1995). In a neonatal
intensive care unit computerized trend monitoring was found to be a useful decision-making
aid for both junior and senior staff, although junior doctors often failed to take advantage of
the information (Alberdi and Logie 1997).
Alternatives to the traditional graphical displays have been found to be beneficial in
teaching the relationships between physiological variables. Effken et al (1994) used Gibson’s
theory (Gibson 1966) of direct perception as the basis for the development of display designs
that exploit perception by showing the inherent relationships between data elements. A
strip chart, integrated balloon display, and etiological potential display were compared for
their ability to teach the fundamentals of haemodynamic monitoring and control. The study
found that perceptually based displays facilitated the learning and practice of this type of
monitoring, for novices as well as experts. In assessing the relative efficacy of visual displays
in anaesthesia Gurushanthaiah et al (1995) stress the importance of using clinical subjects
in a clinical environment. In a comparison of numeric, histogram and polygon displays for
detecting change, the histogram and polygon displays were found to be better than the numeric
display, but only for the medical participants. There was no difference for the non-medical
participants. Jungk et al (1999) proposed an ecological display, designed using the principles
outlined by Vicente and Rasmussen (1992), and a Profilogram horizontal bar-chart display
for haemodynamic monitoring. Using various methods of assessment including eye tracking
they were unable to show a distinct advantage for the ecological display. Subsequently, they
attempted to improve the situational awareness of the ecological display by increasing the
number of parameters represented, adding a type of polygon display and using Gaba’s model
of decision making in anaesthesia as a guide (Gaba 1989). In this study they were able to
show better performance for two versions of the same ecological display when compared to
a conventional trend display (Jungk et al 2000). Blike and colleagues have demonstrated an
object display for the monitoring of shock, constructed from structured decision-making
interviews with three cardiac anaesthetists (Blike et al 1999). The display improved no-shock
recognition by 1 s and shock etiology recognition by 1.4 s when compared to alphanumeric
information. In a subsequent study they were able to simplify the display whilst increasing its
intelligibility (Blike et al 2000).
5.4. Data fusion and smart alarms
Data fusion, the integrated processing of many signals, is a common component of many smart
alarms. Integrated monitoring is a term used to define the ‘simultaneous and interdependent
evaluation of multiple measurements to produce an ongoing status assessment and to identify
the source of a real or potential problem' (Mylrea et al 1993). Multiple sourcing, along with
identifying abnormal rates of change, has been shown to reduce false alarms and increase true
positive alarms for heart rate. Integrating signals can overcome the problems of noisy or
missing signals that arise when each signal is assessed separately and without considering
the overall context. Beinlich and Gaba (1989) implemented a logical alarm reduction
mechanism (ALARM), a diagnostic system for patient monitoring that calculates probabilities
for a differential diagnosis based on the available data and a belief network for probabilistic
reasoning. Rather than producing alarms based on individual data points, this integrated alarm
system generated text messages indicating higher-level problems.
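A minimal sketch of this kind of data fusion, written for illustration only and not reproducing the logic of the ALARM system itself, is given below in Python. Several channels are assessed together, including their direction of change, and a single higher-level text message is produced instead of one threshold alarm per channel; the channel names, thresholds and message wording are assumptions made for this example.

from dataclasses import dataclass
from typing import Optional

@dataclass
class Sample:
    """One synchronized set of readings from several monitoring channels."""
    spo2: float              # pulse oximeter saturation, %
    etco2: float             # end-tidal CO2, kPa
    airway_pressure: float   # peak airway pressure, cmH2O

def fused_alarm(current: Sample, previous: Sample) -> Optional[str]:
    """Assess several channels together, including their rate of change,
    and return a single higher-level text message rather than one limit
    alarm per channel. Thresholds and messages are illustrative only."""
    etco2_lost = current.etco2 < 0.5
    pressure_lost = current.airway_pressure < 2.0
    spo2_falling = current.spo2 < previous.spo2 - 2.0

    # Simultaneous loss of CO2 and airway pressure suggests a breathing-circuit
    # disconnection, whichever single channel crossed its own limit first.
    if etco2_lost and pressure_lost:
        return "Possible breathing-circuit disconnection"
    # A falling saturation with an apparently intact circuit points elsewhere.
    if spo2_falling and current.spo2 < 90.0 and not (etco2_lost or pressure_lost):
        return "Falling SpO2 with intact circuit: check oxygenation"
    return None

if __name__ == "__main__":
    prev = Sample(spo2=98, etco2=5.1, airway_pressure=14)
    now = Sample(spo2=97, etco2=0.2, airway_pressure=1)
    print(fused_alarm(now, prev))   # prints: Possible breathing-circuit disconnection

Even this toy example shows why the overall context matters: a noisy or missing value on one channel need not trigger an alarm if the other channels do not support the same interpretation.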
Alarms of this sort are only as good as the knowledge base and statistics behind
them. The advent of intelligent alarms has been particularly welcomed, though they have
yet to be extensively implemented. Advances in signal detection capabilities and data
management, coupled with a decreasing cost of implementing information technology should
mean that methods such as fuzzy logic (Oberli et al 1999) and neural networks are increasingly
incorporated into patient monitoring systems. In practice, however, such methods have largely
been confined to the academic research sector.
Cerutti and Saranummi (1997) argue that an important reason for the lack of uptake of
such methods lies in the fact that these systems have been developed using a very limited data
set, and have yet to receive adequate testing. To this end, the IMPROVE project (Korhonen et al
1997, Nieminen et al 1997) has concentrated on the acquisition of an annotated data library
of ICU patients that can be used for the development and testing of biosignal interpretation
(BSI) methods.
Intelligent alarms have been implemented in many different guises, though the studies
have almost invariably been experimental and on a small scale. Van Oostrom et al (1989)
developed an intelligent system to read and interpret signals from different monitors, resulting
in more specific, intelligent alarm messages. Twenty different malfunctions were introduced in
real-time tests on an anaesthesia simulator, of which 93% were identified correctly. Harriman
et al (1993) compared three different types of alarm system—the default alarm system, the
baseline alarm system and an adaptive filtering alarm system. The default alarm system
consisted of wide default alarm limits. The baseline system involved setting alarm limits
based on patients’baseline conditions. The adaptive filtering alarm system was designed to
combine the low false alarm rate of the default alarm system with the improved vigilance of
the baseline alarm system. During the 20 cases presented, 13 alarm-worthy events occurred
of which 7, 10 and 13 events were detected by the respective alarm systems. Westenskow
et al (1988) found that a rule-based expert alarm system could correctly identify 619 out of 660
simulated events. Orr et al (1990) and Orr and Westenskowe (1990) found that response time
with neural network-based smart alarms was greatly reduced when compared to conventional
alarms generated by an Ohmeda CO2 monitor, expiratory flow monitor, O2 monitor and airway
pressure monitor. Rheineck-Leyssius and Kalkman (1998, 1999a, 1999b) looked at the relative
effects of artefact rejection, alarm delay, averaging, median filtering and decreasing the alarm
limit from SpO2 < 90% to SpO2 < 85% on the number of true and false alarms elicited by a
pulse oximeter. Of all the methods assessed, changing the alarm limit to <85% reduced the
number of alarms by 82%. A similar reduction of alarms was obtained with an alarm delay of
18 s and an averaging or median filtering epoch of 42 s. However, setting the alarm limit to a
lower level resulted in a lower proportion of true alarms.
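The alarm-processing steps compared in these pulse oximetry studies, namely alarm delay, averaging or median filtering and a lowered alarm limit, can be combined very compactly. The Python sketch below shows one possible combination of a median filter with an alarm delay; the window lengths, sample rate and limit are illustrative assumptions and are not the settings evaluated by Rheineck-Leyssius and Kalkman.

from collections import deque
from statistics import median

class SpO2Alarm:
    """Suppress transient artefacts with a median filter and only alarm
    when the filtered value has stayed below the limit for a minimum
    duration (alarm delay). Parameter values are illustrative."""

    def __init__(self, limit=90.0, filter_window_s=10, delay_s=15, sample_rate_hz=1):
        self.limit = limit
        self.filter = deque(maxlen=filter_window_s * sample_rate_hz)
        self.delay_samples = delay_s * sample_rate_hz
        self.below_count = 0

    def update(self, spo2_sample: float) -> bool:
        """Feed one SpO2 sample; return True if the alarm should sound."""
        self.filter.append(spo2_sample)
        filtered = median(self.filter)
        if filtered < self.limit:
            self.below_count += 1
        else:
            self.below_count = 0
        return self.below_count >= self.delay_samples

if __name__ == "__main__":
    alarm = SpO2Alarm(limit=90.0, filter_window_s=10, delay_s=15)
    # A brief motion artefact (three spurious low samples) should not alarm;
    # a sustained desaturation eventually should.
    trace = [97] * 20 + [60, 58, 61] + [97] * 20 + [85] * 40
    for t, sample in enumerate(trace):
        if alarm.update(sample):
            print(f"Alarm at t = {t} s")
            break

Lengthening the filter window or the delay suppresses more transient artefacts at the cost of a slower response to a genuine desaturation, which is precisely the trade-off between false and true alarms that the cited studies quantify.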
6. Measurement of effectiveness
Determining the utility of any patient monitoring system requires that there be attainable and
appropriate measures of effectiveness. The most easily measured definition of effectiveness
is usability, i.e. how easy or difficult the user finds the system to use. It can be measured
in several ways, including by questionnaire. Usability can also be included in the design stage
of device implementation by using prototyping systems (Edworthy and Stanton 1995). It has
also been suggested that poor usability can contribute to the development of the cry wolf effect
(Breznitz 1983), which in turn may be potentiated by the prejudices and attitudes that users
have towards particular types of feature on monitoring instruments (Nazir and Beatty 2000). However, in
terms of the breadth of the issues discussed in the rest of this review it is a rather inadequate
measure. Unfortunately, going beyond usability trials is very difficult.
Byrick and Cohen (1995) argue that studies that have attempted to evaluate monitors are
often inadequate in four important ways.
• Rarely is there a clear definition of the clinical issue studied, resulting in a lack of specific
measurable outcome.
• The large numbers of patients required to detect clinically significant improvements in
morbidity associated with rare critical incidents prevent randomized clinical trials.
• The use of more frequently occurring intermediate outcomes such as hypoxaemia may
have little direct relationship to true outcomes.
• The lead-time provided by the introduced monitors may not necessarily result in an
improved outcome if it is insufficient to allow corrective behaviour, or if there is no
corrective behaviour applicable.
The result is that there are very few examples of systematic effectiveness measurement in
the literature. Two exceptions are the study of computerized physiologic trend data monitoring
of newborn infants requiring intensive care by Cunningham et al (1998) and the large-scale
randomized evaluation of pulse oximetry by Moller et al (1993a, 1993b). In the neonatal
study, no improvements in patient outcomes were found when an integrated display system,
hypothesized to have better human factors properties and with high usability scores from
clinical users, was introduced into a neonatal ITU. The authors acknowledge that the randomized
controlled study design may itself have failed to detect subtle improvements produced by the
trend monitoring, and that an alternative design may have been more appropriate. The pulse
oximetry trial of over 20 000 patients revealed, as anticipated, that the number of intermediate
outcomes detected, such as hypoxaemia, was greater in the monitored group, but that there
was no difference in the number of postoperative complications.
An interesting half-way house to full clinical trials may be the use of anaesthesia
simulators. There are a wide variety of anaesthesia simulators ranging from simple
simulators for teaching a single skill, through part mission simulators that use a manikin and
some anaesthetic equipment to simulate anaesthesia (Byrne et al 1994), to full mission total
immersion simulators which involve a full OT team (Meurs et al 1997). The focus of these
simulators has been to address the safety culture of anaesthetist behaviour by feedback to
individuals about their performance during simulated critical incidents or as a basis for
examining how individuals interact as a complete OT team. However, as Blike (1999)
points out, if the first challenge to human factors engineering in the OT is to develop realistic
experimental conditions for the testing of devices, then simulation offers the advantage of a more
realistic task for the user than a carefully controlled but limited pure psychological experiment.
He praises Jungk et al (1999) for their use of simulator-based testing of their ecological
interface for haemodynamic monitoring. Anecdotal evidence also suggests that instrument
manufacturers are using full mission simulators to test the human factors performance of
their equipment at prototype stage, though none of this work, which is clearly commercially
sensitive, seems to have found its way into the public domain as yet.
Based on the diagnostic imaging study of Fryback and Thornbury (1991), Byrick and
Cohen (1995) provide a template for the assessment of anaesthesia technology that consists
of the following:
• Basic science of the technology
• Site/indications for use
• Efficacy
• Effectiveness
7. Conclusions
It is clear from this review that more research and development of the psychological framework
for describing the interaction of human factors error and patient monitoring is required.
However, the frameworks that do exist offer descriptions that suggest new methods of design
suitable for experimental study. What is preventing the design and testing of new human factors
features for patient monitors is not the detail of the psychological framework, but the problem
of meaningful measures of effectiveness that can be validated in terms of a reduction in human
factors errors in real clinical situations. Here, the rarity of critical incidents in
clinical practice results in prolonged trial periods, and the multitude of factors affecting the
outcome makes the use of blinded, controlled trials ineffective. The study of Cunningham
et al (1998) vividly illustrates this. A range of surrogate measures of performance needs to
be developed that can be validated for their effect on recognized types of human factors error.
We would suggest that validation of such measures can only be done using some sort of total
immersion simulation such as that available for anaesthesia.
Short of properly validated surrogate measures of clinical effectiveness, subjective
assessment of effectiveness by users should not be ignored. Affordance engineering can
proceed on the basis of subjective assessment by users, particularly if this is done in the
context of user-centred design and prototyping. The extension of urgency mapping to visual
alarms and other types of display is of particular interest in this context.
Lastly, more systematic investigation of the basic human factors and psychophysical
characteristics of devices such as head mounted displays, whose unit cost will fall as they are
used in other areas such as mobile communications, should be pursued. Such work should
aim at building a library of characteristics of these devices so that they can be used selectively
as the psychological models grow in sensitivity and accuracy.
Whilst the focus of human factors studies has, in recent years, moved from instrument
design and the application of technology towards the deconstruction of both the organization
and system, it is time that patient monitor designers reasserted that they have a contribution
to make in improving the safety of the monitored patient.
References
AAMI 1988 Human factors engineering guidelines and preferred practices for the design of medical devices, AAMI
HE-1988 (Arlington, VA: Association for the Advancement of Medical Instrumentation)
AAMI 1993 Human factors engineering guidelines and preferred practices for the design of medical devices, ANSI/AAMI
HE48 (Arlington, VA: American National Standards Institute/Association for the Advancement of Medical
Instrumentation)
Agency M D 2000 Equipped to care: the safe use of medical devices in the 21st century (London: Medical Devices
Agency)
Alberdi E and Logie R 1997 Cognitive engineering: a symbiosis between “cognitive” knowledge and “engineering”
design 20th Annual Meeting of the Cognitive Science Society (1997)
Allnutt M 1987 Human factors in accidents Br. J. Anaesth. 59 856–64
Banks I and Tackley R 1994 A standard set of terms for critical incident recording? Br. J. Anaesth. 73 703–8
Beatty P 2000 Advances in patient monitoring Horizons in Medicine vol 12 ed P Weissberg (London: Royal College
of Physicians of London) pp 395–407
Beinlich I and Gaba D 1989 The alarm monitoring system—intelligent decision making under uncertainty
Anesthesiology 71 A337
Blike G T 1999 The challenges of human engineering research J. Clin. Monit. Comput. 15 413–5
Blike G T, Surgenor S D and Whalen K 1999 A graphical object display improves anesthesiologists’ performance on
a simulated diagnostic task J. Clin. Monit. Comput. 15 37–44
Blike G T, Surgenor S D, Whalen K and Jensen J 2000 Specific elements of a new hemodynamics display improves
the performance of anesthesiologists J. Clin. Monit. Comput. 16 485–91
Block F 1992 Evaluation of users’ abilities to recognize musical alarm tones J. Clin. Monit. 8 285–90
Block F E, Rouse J D, Hakala M and Thompson C L 2000 A proposed new set of alarm sounds which satisfy standards
and rationale to encode source information J. Clin. Monit. Comput. 16 541–46
Botney R and Gaba D M 1996 Human Factors in Monitoring Monitoring in Anesthesia and Critical Care 3rd edn,
ed C D Blitt and R L Hines (New York: Churchill Livingstone) pp 23–54
Braun C C and Silver N C 1995 Interaction of signal word and color on warning labels—differences in perceived
hazard and behavioral compliance Ergonomics 38 2207–20
Braun C C, Kline P B and Silver N C 1995 The influence of color on warning label perceptions Int. J. Ind. Ergonomics
15 179–87
Breznitz S 1983 Cry-wolf: The Psychology of False Alarms (Hillsdale, NJ: Erlbaum) pp 1–100
Busse D and Johnson C 1999 Identification and analysis of incidents in complex medical environments First Workshop
on Human Error and Clinical Systems (Glasgow, Scotland 1999)
Byrick R and Cohen M 1995 Technology assessment of anaesthesia monitors: problems and future directions Can.
J. Anaesth. 42 234–9
Byrne A J, Hilton P J and Lunn J N 1994 Basic simulations for anesthetists—a pilot-study of the access system
Anaesthesia 49 376–81
Caplan R, Vistica M, Posner K and Cheney F 1997 Adverse anesthetic outcomes arising from gas delivery equipment:
a closed claims analysis Anesthesiology 87 741–8
Carthey J, de Leval M R and Reason J T 2001 The human factor in cardiac surgery: errors and near misses in a high
technology medical domain Ann. Thorac. Surg. 72 300–5
CEPOD 1998 The report of the national confidential enquiry into perioperative deaths 1996/97 London
Cerutti S and Saranummi N 1997 Improving control of patient status in critical care IEEE Eng. Med. Biol. 19–20
Clothier C 1994 The Allitt inquiry (London: HMSO)
Cooper J, Newbower R, Long C and McPeek P 1978 Preventable anesthesia mishaps: a study of human factors
Anesthesiology 49 399–406
Cunningham S, Deere S, Symon A, Elton R and McIntosh N 1998 A randomized controlled trial of computerized
physiologic trend monitoring in an intensive care unit Crit. Care Med. 26 2053–60
Currie M 1989 Prospective study of anaesthesia critical events in a teaching hospital Anaesth. Intensive Care 403–11
DeAnda A and Gaba D M 1990 Unplanned incidents during comprehensive anaesthesia simulation Anaesth. Analg.
71 77–82
Detmer W and Friedman C 1994 Proc. 18th Annual Symposium on Computer Applications in Medical Care: Academic
physicians’ assessment of the effects of computers on health care (Washington 1994) 558–62
Edworthy J 1994 The design of non-verbal auditory warnings Appl. Ergonomics 25 202–10
Edworthy J and Stanton N 1995 A user-centred approach to the design and evaluation of auditory warning signals: 1.
Methodology Ergonomics 38 2262–80
Edworthy J, Loxley S and Dennis I 1991 Improving auditory warning design: relationship between warning sound
parameters and perceived urgency Hum. Factors 33 205–32
Effken J A, Nam-Gyoon K and Shaw R E 1994 Making the relationship visible: testing alternative display design
strategies for teaching principles of hemodynamic monitoring and treatment presented at AIME
Eisenkraft J 1997 A commentary on anesthesia gas delivery equipment and adverse outcomes, Anesthesiology 87
731–3
Finley G and Cohen A 1991 Perceived urgency and the anaesthetist: responses to common operating room monitor
alarms Can. J. Anaesth. 38 958–64
Frankenberger H, Hecker E and Weis K 1990 Anaesthesia equipment alarms and monitoring facilities: present state
of development Eur. J. Anaesthesiol. 789–95
Fryback D and Thornbury J 1991 The efficacy of diagnostic imaging Med. Decision Making 11 88–94
Gaba D 1989 Human error in anesthetic mishaps Int. Anesthiol. Clin. 27 137–47
Gibson J 1966 The Senses Considered as Perceptual Systems (Boston, MA: Houghton Mifflin)
Gilhooly K, Logie R, Ross D, Ramayya P and Green C 1991 Users’ perceptions of a computerised information system
in intensive care (ABICUS) on introduction and after 2 months use Int. J. Clin. Monit. Comput. 8 101–6
Greaney J and MacRae A 1996 Diagnosis of fault location using polygon displays Ergonomics 39 400–11
Green C A, Logie R H and Gilhooly K J 1996 Aberdeen polygons: computer displays of physiological profiles for
intensive care Ergonomics 39 412–28
Gurushanthaiah K, Weinger M and Englund C 1995 Visual display format affects the ability of anaesthesiologists to
detect acute physiologic changes Anesthesiology 83 1184–93
Haas A 1998 The design of auditory signals for ICU and OR environments J. Clin. Eng. 23 33–36
Haas E C and Edworthy J 1996 Designing urgency into auditory warnings using pitch, speed and loudness Comput.
Control Eng. J. 193–8
Harriman A, Watt R, Hameroff S, Maslana E, Navibi M and Mylrea K 1993 Dynamic adaptive filtering for control
of anesthesia monitoring alarm systems Anesthesiology 79 A448
Hayes B 1997 Alarms Non-Invasive Cardiovascular Monitoring (Principles and Practice Series) ed B Hayes (London:
BMJ Books) ch 15
Hollnagel E 1997 Cognitive ergonomics: it’s all in the mind Ergonomics 40 1170–82
Johnson C 1999 Why human error modeling has failed to help systems development Interact. Comput. 11 517–24
Jungk A, Thull B, Hoeft A and Rau G 1999 Ergonomic evaluation of an ecological interface and a profilogram display
for hemodynamic monitoring J. Clin. Monit. Comput. 15 469–79
Jungk A, Thull B, Hoeft A and Rau G 2000 Evaluation of two new ecological interface approaches for the anesthesia
workplace J. Clin. Monit. Comput. 16 243–58
Kestin I, Miller B and Lockhart C 1988 Auditory alarms during anaesthesia monitoring Anaesthesiology 69 106–9
Khan A and Loeb R 1993 Anesthesiologists’responses to audible alarms Anesth. Analg. 76 S185
Korhonen I, Ojaniemi J, Nieminen K, van Gils M, Heikela A and Kari A 1997 Building the IMPROVE data library IEEE Eng.
Med. Biol. 25–32
Logie R, Hunter J, McIntosh N, Gilhooly K, Alberdi E and Reiss J 1997 Medical cognition and computer support in
the intensive care unit: a cognitive engineering approach Engineering Psychology and Cognitive Ergonomics
Volume Two Job Design and Product Design (Ashgate, 1997) ed D Harris 197 –174
Loeb R, Jones B, Behrman K and Leonard R 1990 Anesthetists cannot identify audible alarms Anesthesiology 73
A539
McIntyre J and Stanford L 1985 Ergonomics and anaesthesia: auditory alarm signals in the operating room
Anaesthesia—Innovation in Management (Berlin: Springer) pp 87–92
Meredith C and Edworthy J 1995 Are there too many alarms in the intensive care unit? An overview of the problem
J. Adv. Nurs. 21 15–20
Meurs W v, Good M and Lampotang S 1997 Functional anatomy of full-scale patient simulators J. Clin. Monit. 13
317–24
Moller J et al 1993a Randomized evaluation of pulse oximetry in 20,082 patients: I. Design, demography, pulse
oximetry failure rate, and overall complication rate Anesthesiology 78 436–44
Moller J et al 1993b Randomized evaluation of pulse oximetry in 20,082 patients: II. Perioperative events and
postoperative complications Anesthesiology 78 445–53
Momtahan K and Tansley B 1989 An ergonomic analysis of the auditory alarm signals in the operating room and the
recovery room Annual Conference of the Canadian Acoustical Association (Halifax, Nova Scotia, 1989)
Morris R W and Montano S R 1996 Response times to visual and auditory alarms during anaesthesia Anaesth Intensive
Care 24 682–4
Myerson K, Ilsley A and Runciman W 1986 An evaluation of ventilator monitoring alarms Anaesth. Intensive Care
14 174–85
Mylrea K, Orr J and Westenskow D 1993 Integration of monitoring for intelligent alarms in anesthesia: Neural
networks—can they help? J. Clin. Monit. 9 31–7
Nazir T and Beatty P 2000 Anaesthetists’ attitudes to monitoring instrument design options Br. J. Anaesth. 85 781–4
Nieminen K, Langford R M, Morgan C, Takala J and Kari A 1997 A clinical description of the IMPROVE data library
IEEE Eng. Med. Biol. 21–4
Oberli C, Urzua J, Saez C, Guarini M, Cipriano A, Garayar B, Lema G, Canessa R, Sacco C and Irarrazaval M 1999
An expert system for monitor alarm integration J. Clin. Monit. Comput. 15 29–35
O’Carroll T 1986 Survey of alarms in an intensive therapy unit Anaesthesia 41 742–4
O’Hare D 2000 The ‘Wheel of Misfortune’: a taxonomic approach to human factors in accident investigation and
analysis in aviation and other complex systems Ergonomics 43 2001–19
Orr J A, Simon F H, Bender H-J and Westenskow D R 1990 Response time with smart alarms Anesthesiology 73 A447
Orr J A and Westenskowe D R 1990 Evaluation of a breathing system alarm based on neural networks Anesthesiology
73 A445
Patterson R D 1982 Guidelines for auditory warning systems on civil aircraft Civil Aviation Authority, London CAA
Paper 82017
Patterson R D, Edworthy J, Shailer M J, Lower M C and Wheeler P D 1986 Alarm sounds for medical equipment
in intensive care areas and operating theatres Institute of Sound and Vibration Research, Southampton Report
AC598
Rasmussen J, Pedersen O, Mancini G, Griffon M and Gagnolet P 1981 Classification system for reporting events
involving human malfunctions, RISO-M-2240 (Roskilde, Denmark: Riso Nat. Lab.)
Rasmussen J, Pejtersen A and Goodstein L 1992 Cognitive Engineering—Concepts and Applications: Part 1. Concepts
vol 1 (New York: Wiley)
Rasmussen J and Vicente K J 1987 Cognitive control of human activities: the implications for ecological interface
design (Roskilde, Denmark: RISO Labs.)
Reason J 1987 Generic error modelling system (GEMS): a cognitive framework for locating common human error
forms New Technology and Human Error ed J Rasmussen, K Duncan and J Leplat (Chichester: Wiley) p 63
Reason J 1990 Human Error (Cambridge: Cambridge University Press)
Reason J 2000 Human error: models and management Br. Med. J. 320 768–70
Rheineck-Leyssius A T and Kalkman C J 1998 Influence of pulse oximeter settings on the frequency of alarms and
detection of hypoxemia—theoretical effects of artifact rejection, alarm delay, averaging, median filtering or a
lower setting of the alarm limit J. Clin. Monit. Comput. 14 151–6
Rheineck-Leyssius A T and Kalkman C J 1999a Advanced pulse oximeter signal processing technology compared to
simple averaging. Its effect on frequency of alarms in the postanesthesia care unit J. Clin. Anesth. 11 196–200
Rheineck-Leyssius R T and Kalkman C J 1999b Advanced pulse oximeter signal processing technology compared to
simple averaging: I. Effect on frequency of alarms in the operating room J. Clin. Anesth. 11 192–5
Rowe L, Galletly D and Henderson R 1992 Accuracy of text entries within a manually compiled anaesthetic record
Br. J. Anaesth. 68 381–7
Short T, O’Regan A, Lew J and Oh T 1992 Critical incident reporting in an anaesthetic department quality assurance
programme Anaesth. Intensive Care 47 3–7
Stanton N and Edworthy J 1998 Auditory affordances in the intensive treatment unit Appl. Ergonomics 29 389–94
Taylor-Adams S, Vincent C and Stanhope N 1999 Applying human factors methods to the investigation and analysis
of clinical adverse events Safety Science 31 143–59
The Advisory Working Group on Alarms on Clinical Monitors 1995 The report of the expert working group on
alarms on clinical monitor in response to recommendation 11 of the clothier report (the Allit inquiry) (London:
Medical Devices Agency)
Thrush D 1992 Automated anesthesia records and anesthetic incidents J. Clin. Monit. 8 59–61
van Oostrom J, van der Aa J, Nederstigt J, Beneken J and Gravenstein J 1989 Intelligent alarms in the anaesthesia
circle breathing system Anesthesiology 71 A336
Vicente K and Rasmussen J 1992 Ecological interface design: theoretical foundations IEEE Trans. System Man.
Cybernet 22 589–606
Wallace M, Ashma M and Matjasko M 1994 Hearing acuity of anesthesiologists and alarm detection Anesthesiology
81 13–28
Watt R, Maslana E and Mylrea K 1993 Alarms and anesthesia IEEE Eng. Med. Biol. 34–41
Webb R, van der Walt J, Runciman W, Williamson J, Cockings J, Russell W and Helps S 1993 Which monitor? An
analysis of 2000 incident reports Anaesth. Intensive Care 21 529–42
Weinger M and Englund C 1990 Ergonomic and human factors affecting anesthetic vigilance and monitoring
performance in the operating room environment Anesthesiology 995–1021
Westenskow D R, Orr J A, Simon F H, Ing D, Bender H-J and Frankenburger H 1992 Intelligent alarms reduce
anesthesiologist’s response time to critical faults Anesthesiology 77 1074–9
Westenskow D, Loeb R, Brunner J and Pace N 1988 Expert alarms and autopilot in an anesthesia workstation
Anesthesiology 69 A731
Wickens C 1984 Engineering Psychology and Human Performance (Columbus, OH: Charles E Merrill Publishing
Co.)
Wiegmann D A and Shappell S A 2001 Human error analysis of commercial aviation accidents: application of the
human factors analysis and classification system (HFACS) Aviat. Space Environ. Med. 72 1006–16
Wiklund L, Hok B, Stahl K and Jordeby-Jonsson A 1994 Postanesthesia monitoring revisited: Frequency of true and
false alarms from different monitoring devices J. Clin. Anesth. 182–8
Williamson J, Webb R, Runciman W, Walt J V D and Sellen A 1993 Human failure: an analysis of 2000 incident
reports Anaesth. Intensive Care 21 678–83
Willson D 1994 Survey of nurse perceptions regarding the utilization of bedside computers Symposium of Computing
Applied to Medical Care (1994) pp 553–7
Wright D, Mackenzie S, Buchan I, Cairns C and Price L 1991 Critical incidents in the intensive therapy unit Lancet
338 676–8
Zotov D 1996 Reporting human factors accidents ISASI 29 4–20