All content in this area was uploaded by Nicholas Genes on Mar 24, 2015
Abstract
As hospitals incorporate information technology
(IT), their operations become increasingly vulnerable
to technological breakdowns and attacks. Proper emer-
gency management and business continuity planning
require an approach to identify, mitigate, and work
through IT downtime. Hospitals can prepare for these
disasters by reviewing case studies. This case study
details the disruption of computer operations at Mount
Sinai Medical Center (MSMC), an urban academic
teaching hospital. The events, and MSMC’s response,
are narrated and the impact on hospital operations is
analyzed. MSMC’s disaster management strategy pre-
vented computer failure from compromising patient
care, although walkouts and time-to-disposition in the
emergency department (ED) notably increased. This
incident highlights the importance of disaster pre-
paredness and mitigation. It also demonstrates the
value of using operational data to evaluate hospital
responses to disasters. Quantifying normal hospital
functions, just as with a patient’s vital signs, may help
quantitatively evaluate and improve disaster manage-
ment and business continuity planning.
Key words: electronic health record, computer secu-
rity, medical informatics, disaster planning, hospital
administration
Introduction
US hospitals increasingly rely on computers.
From 2008 to 2012, the fraction of US hospitals using
basic electronic health records (EHRs) increased
more than fourfold.1 In 2012, more than three-quarters
of US hospitals reported some use of electronic
documentation for providers. This increased from 10
percent in 2005.2
Hospitals will increasingly rely on technology as
the delivery of medicine grows more complex and as
the need grows to quickly articulate vast clinical
knowledge as easily updated quantitative guidelines.3
No matter how expected or even inevitable, this
reliance is concerning, because clinical care must not
fail when its servers do. Regulatory standards do not
pardon a breach in patient care if a computer problem
caused it. Moreover, unscheduled downtime is costly for
hospitals: perhaps $14 per bed for each minute of
downtime.4
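To give a sense of scale, the per-bed, per-minute figure can simply be multiplied out. The following is a minimal sketch, not a costing method from the cited source: the function name and the 500-bed, 60-minute inputs are hypothetical, and only the $14 rate comes from the text.

```python
# Rate quoted in the text: roughly $14 per bed per minute of downtime.
COST_PER_BED_MINUTE_USD = 14


def estimated_downtime_cost(beds: int, minutes: int,
                            rate: float = COST_PER_BED_MINUTE_USD) -> float:
    """Return a rough downtime cost estimate in US dollars."""
    if beds < 0 or minutes < 0:
        raise ValueError("beds and minutes must be non-negative")
    return beds * minutes * rate


if __name__ == "__main__":
    # Hypothetical inputs: a 500-bed hospital offline for 60 minutes.
    print(estimated_downtime_cost(beds=500, minutes=60))  # 420000
```

The estimate assumes every bed is affected for the whole period, so it is best read as an upper bound for a partial outage.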
To mitigate the impact of health information tech-
nology (IT) failures on patient care, it is useful to exam-
ine what happens during these failures. Let us learn
from history. However, waiting for the caprice of nature
to oblige us is futile, especially as the frequency and
duration of planned electronic system downtime
episodes decline.5 Moreover, waiting for an electronic disaster
violates the best practices of emergency planning
and business continuity. Accordingly, some hospitals
conduct surprise downtime drills. For example, at
Intermountain Healthcare’s LDS Hospital, IT and
emergency preparedness staff walk through wards and
review downtime procedures with hospital personnel.6
Those efforts notwithstanding, the effects of and
response to real disasters, as opposed to drills, have
been infrequently described. The CIO of Beth Israel
Deaconess Medical Center discussed a protracted net-
work outage in 2002 and its mitigation strategy.7
However, the best practices for recovering from cyberattacks
on a healthcare institution are grafted from
guidelines created for enterprises that do not provide
healthcare.8,9
An academic medical center’s response to widespread computer failure
Nicholas Genes, MD, PhD; Michael Chary, BS; Kevin W. Chason, DO
CASE STUDY
DOI:10.5055/ajdm.2013.0000
www.disastermedicinejournal.com
This article describes an unplanned hospital-wide
IT downtime resulting from disabled personal com-
puters (PCs). Briefly, on April 21, 2010, the antivirus
software company McAfee Inc. (Santa Clara, CA) sent
an erroneous update to users of McAfee VirusScan
who had also installed Windows XP Service Pack 3
(Microsoft Corporation, Redmond, WA). That “update”
either disconnected computers from the network or
caused them to reboot incessantly, rendering the com-
puters unusable for clinical care. This article outlines
Mount Sinai Medical Center’s (MSMC’s) recognition of
and response to the McAfee Antivirus Incident (MAI)
and then discusses its impact on hospital operations.
Facility description
MSMC is a 1,171-bed academic medical center in
Manhattan, NY. It has affiliates in Queens and
Westchester that make use of MSMC resources for
diagnostic imaging, laboratory testing, and ancillary
services. During MAI, the hospital was at stage 2 on
the Healthcare Information and Management Systems Society
(HIMSS) scale for electronic medical record adoption.
That means MSMC had a clinical data repository, participated
in a regional health information organization
(RHIO) to which it electronically reported its lab
results, and had a picture archiving and communication
system (PACS) throughout the hospital. It also
had an all-electronic emergency department (ED)
information system.
MSMC maintains an active IT disaster recovery
program. It backs up data via tape and stores them at
two offsite locations and can recover those data within
2 days. One offsite location also maintains copies of
hospital applications. MSMC completely tests this
backup system every year.
Methods
Data collection
Six days after MAI, those involved in the response
were debriefed. They drafted an After Action Report
(AAR). Some hospital departments also provided
usage statistics for the day of the incident, Wednesday,
April 21, 2010, and two Wednesdays before and one
Wednesday after. The ED provided its patient census
and hourly arrival rate. It also provided the average
length of stay and the average time to bed, to see a nurse, to see a
doctor, and to disposition or walking out of the ED.
The radiology department provided the number of
chest X-rays (CXRs), abdominal computed tomogra-
phy (CT) scans, and chest CTs. MSMC provided the
overall hospital census and discharge rate. The
departments of general surgery provided data about
operating room usage. This article does not discuss
them because those data were insufficient to support any
conclusions. Mount Sinai’s Institutional Review Board
granted all data collection an exemption from its
review.
Data analysis
All data were plotted in Prism 4 for Macintosh by
GraphPad. As the data available were insufficient to
draw statistical inferences, no statistical analysis was
performed.
Disaster narrative
The MAI began on April 21, 2010, at 9:50 AM EDT
when a faulty McAfee file was downloaded and distrib-
uted to hospital PCs running Windows XP Service Pack 3
(see Figure 1). By 10:15 AM, some computers alerted
users with a pop-up that a virus (W32/Wecorl.a) infecting
the svchost.exe file had been detected. Users were given
an option to clean or delete that file. Either option led to
the PC shutting down, rebooting, detecting a corrupt
svchost.exe file before Windows fully loaded, alerting the
user, shutting down and so on—in an infinite loop.
American Journal of Disaster Medicine, Vol. 8, No. 1, Winter 2013
IT recorded the first help desk ticket regarding MAI
at 10:20 AM. By 10:30 AM, IT had received enough calls to
declare a level 1 activation, triggering a notification to
hospital leadership of an anticipated disruptive event
(level 1 activations are often used for severe weather
forecasts, and are the lowest of the four levels). The
entire IT department met, notified operations, and
established an emergency operations command center at
11 AM. At that time, level 2 status was declared,
acknowledging a minimal impact on hospital operations
that nonetheless requires coordinated interdepartmental
response. The Hospital Incident Command
System (HICS) was activated and personnel with
Command Center responsibilities were notified via
MessageOne/AlertFind, MSMC’s mass notification system.
The system distributed voice messages to personnel’s
work, home, and cell numbers (it can also send
messages via short message service or e-mail). IT also
readied two-way radios and called McAfee for guidance.
At 11:30 AM, HICS upgraded MAI to level 3 status, indicating
disruption of operations throughout the hospital
but that normal patient care activities would continue.
Level 4 status, indicating that normal patient care activities
were interrupted and all resources are devoted to
emergency operations, was never reached during MAI.
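The four-level activation scale and the escalation times narrated above can be written down as a small data structure. This is an illustrative sketch only: the enum member names, `MAI_ESCALATIONS`, and both helper functions are hypothetical and are not part of HICS or of MSMC's systems; the level meanings and times are taken from the narrative.

```python
from enum import IntEnum


class ActivationLevel(IntEnum):
    """Four-level scale as described in the narrative (member names are illustrative)."""
    ANTICIPATED_EVENT = 1         # leadership notified of an anticipated disruptive event
    MINIMAL_IMPACT = 2            # minimal impact, coordinated interdepartmental response
    HOSPITAL_WIDE_DISRUPTION = 3  # operations disrupted, normal patient care continues
    PATIENT_CARE_INTERRUPTED = 4  # normal patient care interrupted


# Escalation times on April 21, 2010, as reported in the narrative (24-hour clock).
MAI_ESCALATIONS = [
    ("10:30", ActivationLevel.ANTICIPATED_EVENT),
    ("11:00", ActivationLevel.MINIMAL_IMPACT),
    ("11:30", ActivationLevel.HOSPITAL_WIDE_DISRUPTION),
]


def level_at(time_hhmm: str, escalations=MAI_ESCALATIONS):
    """Return the level in force at a zero-padded HH:MM time, or None before activation."""
    current = None
    for declared_at, level in escalations:
        if declared_at <= time_hhmm:  # zero-padded HH:MM strings sort chronologically
            current = level
    return current


def patient_care_continues(level: ActivationLevel) -> bool:
    """Levels below 4 leave normal patient care activities running."""
    return level < ActivationLevel.PATIENT_CARE_INTERRUPTED


if __name__ == "__main__":
    print(level_at("12:35").name)
    print(patient_care_continues(level_at("12:35")))
```

Using an ordered enum keeps comparisons such as "below level 4" explicit, which matches how the narrative distinguishes disruption of operations from interruption of patient care.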
At 11:15 AM, McAfee informed IT that MAI came
from its software falsely identifying a system file as a
virus. By 11:30 AM, Desktop Support began rebuilding
affected clinical PCs. It began in the ED. To mitigate
damage, at 11:55 AM, a broadcast message was e-mailed
to all users, asking them to implement local downtime
procedures and shut off all computers. The IT help desk
began recommending this practice as well.
With the implementation of “downtime proce-
dures” throughout the institution, legacy forms were
used to document notes and place orders for patient
care. Patient bed management and tracking on units,
which was no longer possible on local workstations,
moved to dry-erase whiteboards in some areas. Paper
logs were maintained on all units, to capture patient
movements and to ensure that hospital throughput
was maintained.
At 12:35 PM, Desktop Engineering announced a
solution: roll back the affected McAfee file and reboot
from a compact disc (CD). Hospital operations priori-
tized the distribution of CDs. At 1 PM, engineers, again
in the ED, successfully rescued the first falsely
infected PC. One PC per ED clinical care location was
restored. An auto-booting CD was developed by 3 PM,
and IT spread through the hospital to rescue the PCs.
To provide additional staff, technically competent
medical students were solicited. Teams of 7-15 can-
vassed the hospital, beginning with clinical areas, to
apply the patch. Security traveled with the teams,
when needed, to open locked office doors (it was
believed that restoring the PCs behind locked doors
was more efficient than logging offices and areas to
return to later).
Figure 1. Timeline of Windows PC disturbance at MSMC on April 21, 2010.
For the next 6 hours, the teams roved the hospital.
The night shift was informed of the incident and its
fallout, and was told to continue using paper documen-
tation and ordering. By 9:30 PM, most of the 3,200
affected PCs had been restored.
Overnight, IT staff continued to reboot affected
workstations. By 7 AM on Thursday, arriving staff were
advised it was safe to begin using computers that had
been shut off. Additional PC restorations were done on
a case-by-case basis. At 4 PM Thursday, it was decided
that no further resources were needed and so the
Hospital Emergency Operations Command Center
was demobilized, and the medical center returned to
normal operations.
Analysis
After action report
The AAR suggested that having multiple independent
means of communication, such as cell phones,
two-way radios, and telephones, was valuable. Keeping
these devices ready requires foresight but is not pro-
hibitively expensive.
The AAR also noted that several departments
lacked the supplies to keep paper records. Perhaps
because of inadequate paper supplies, these depart-
ments were also unable to tabulate usage statistics
during the downtime, which limited efforts to quantify
the impact of MAI.
Further comments in the AAR noted that, while
the laboratory system remained functional, results
had to be hand-delivered, faxed, read over the phone to
providers, or directed to still-functioning networked
printers on units. Laboratory personnel found this
process of relaying results to be labor-intensive and
temporarily diverted staff from routine outpatient lab
processing to help handle ED and inpatient needs.
CDs were distributed quickly to volunteers as
they were created. The number of discs made and the
names of the volunteers who received discs were not
logged, which made it difficult to recover the discs
from the volunteers who distributed them. Inadequate
logging of disc creation and distribution itself creates
a security vulnerability because unrecovered discs could
later be used to compromise workstations.
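One way to close the logging gap described above is a simple ledger that records each disc as it is issued and checked back in. The sketch below is hypothetical: the `DiscLedger` class, its method names, and the disc identifiers are assumptions for illustration, not a description of what MSMC implemented.

```python
from dataclasses import dataclass, field


@dataclass
class DiscLedger:
    """Track which recovery discs were issued and which have come back."""
    issued: dict = field(default_factory=dict)   # disc_id -> recipient name
    returned: set = field(default_factory=set)   # disc_ids that were handed back

    def issue(self, disc_id: str, recipient: str) -> None:
        """Record that a disc was handed to a named recipient."""
        if disc_id in self.issued:
            raise ValueError(f"disc {disc_id} was already issued")
        self.issued[disc_id] = recipient

    def check_in(self, disc_id: str) -> None:
        """Record that a previously issued disc was returned."""
        if disc_id not in self.issued:
            raise KeyError(f"disc {disc_id} was never issued")
        self.returned.add(disc_id)

    def outstanding(self) -> dict:
        """Discs still in circulation, keyed by disc_id."""
        return {d: r for d, r in self.issued.items() if d not in self.returned}


if __name__ == "__main__":
    ledger = DiscLedger()
    ledger.issue("CD-001", "volunteer A")   # hypothetical identifiers
    ledger.issue("CD-002", "volunteer B")
    ledger.check_in("CD-001")
    print(ledger.outstanding())  # {'CD-002': 'volunteer B'}
```

A paper sign-out sheet with the same three columns (disc, recipient, returned) would serve the same purpose during a downtime when no workstation is available.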
Operations data
Figure 2 shows the average time in minutes for key
ED milestones on Wednesdays in April 2010: the time
needed for a patient to be placed in a bed, to be evaluated
by an RN, and to be evaluated by an MD; the time needed to reach a
disposition decision; and the total length of stay in the ED.
All metrics on the day of MAI (April 21) were notice-
ably longer. However, total ED lab orders and lab
orders per ED patient were relatively similar through
all the dates examined (not shown). More broadly,
slightly fewer chemistry and hematology labs
were processed throughout the hospital (not shown).
Figure 3 shows the number of radiology department
studies processed for CXRs and abdominal CTs. More
CXRs and abdominal CTs were performed on April 21
compared to other days. Although MAI was not associ-
ated with a significant overall change in the pattern of
ordering of imaging studies, between noon and 4 PM on
April 21, just 36 CXRs were performed, in contrast to
47, 60, and 56 during the same time period on the sur-
rounding Wednesdays. Abdominal CTs showed no
reciprocal change.
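The size of the noon-to-4 PM dip in CXRs can be expressed relative to the surrounding Wednesdays. The sketch below is a descriptive calculation only, in keeping with the decision not to draw statistical inferences from so few data points; the function name is hypothetical and the counts are those quoted in the text.

```python
from statistics import mean


def percent_change_from_baseline(observed: float, baseline: list) -> float:
    """Percent difference between an observed count and the mean of baseline counts."""
    if not baseline:
        raise ValueError("baseline must contain at least one value")
    base = mean(baseline)
    if base == 0:
        raise ValueError("baseline mean must be nonzero")
    return (observed - base) / base * 100


# CXRs performed between noon and 4 PM, as reported in the text.
cxr_incident_day = 36
cxr_surrounding_wednesdays = [47, 60, 56]

change = percent_change_from_baseline(cxr_incident_day, cxr_surrounding_wednesdays)
print(f"{change:.1f}%")  # -33.7%
```

With only three comparison days, the result describes the size of the dip but says nothing about whether it exceeds normal week-to-week variation.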
Figure 2. Effect of MAI on ED throughput metrics.
The time (minutes) to key milestones in a patient visit
is reported for Wednesdays in April 2010.
Surgical operations on April 21 that began before
or after the primary PC downtime associated with
MAI, that is, 10 AM to 4 PM for OR computers, took
slightly longer: an average of 3 hours 16 minutes
compared to 2 hours 28 minutes, 2 hours 51 minutes,
and 2 hours 53 minutes for the surrounding
Wednesdays (not shown). For cases finishing outside
that window, the OR time was similarly longer: 4
hours 5 minutes compared to 3 hours 16 minutes, 3
hours 34 minutes, and 3 hours 25 minutes on the
surrounding Wednesdays. For all Wednesdays, the overall
case load was comparable.
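The operating room durations quoted above are easier to compare once they are converted to minutes. The following sketch uses hypothetical helper names and takes an unweighted mean of the surrounding Wednesdays' averages, which is an assumption because per-day case counts are not given; the durations themselves are those reported in the text.

```python
import re
from statistics import mean


def to_minutes(duration: str) -> int:
    """Convert text such as '3 hours 16 minutes' into total minutes."""
    match = re.fullmatch(r"(\d+) hours? (\d+) minutes?", duration.strip())
    if match is None:
        raise ValueError(f"unrecognized duration: {duration!r}")
    hours, minutes = map(int, match.groups())
    return hours * 60 + minutes


def excess_minutes(incident: str, comparison: list) -> float:
    """Minutes by which the incident-day average exceeds the comparison mean."""
    return to_minutes(incident) - mean(to_minutes(d) for d in comparison)


# Averages quoted in the text for the two groups of cases.
first_group = excess_minutes(
    "3 hours 16 minutes",
    ["2 hours 28 minutes", "2 hours 51 minutes", "2 hours 53 minutes"],
)
second_group = excess_minutes(
    "4 hours 5 minutes",
    ["3 hours 16 minutes", "3 hours 34 minutes", "3 hours 25 minutes"],
)
print(first_group, second_group)
```

Under these assumptions, the two groups ran about 32 and 40 minutes longer, respectively, than the mean of the surrounding Wednesdays.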
Discussion
MAI resembled a cyberattack. McAfee’s error dis-
abled computers and disrupted operations at hospitals
and institutions around the world, including at
MSMC.10 MSMC recognized the incident and coordi-
nated among emergency management, hospital lead-
ership, IT, and the hospital community to respond
rapidly and effectively.
From what was learned from this episode, MSMC has
launched an emergency management committee that
has implemented notification protocols to respond to
future IT failures. The committee has created admin-
istrative policy to guide hospital operations during IT
disruptions, to ensure business continuity. The policy
standardizes the location of downtime forms, supplies,
and recovery of information when IT systems return
to availability.
Future plans for widespread computer failure will
involve logging the creation and distribution of boot
discs to ensure complete recovery, offsite and local data
backups to enable safety of patient data and speedy
resumption of activities after a downtime, and drills to
better train staff on downtime procedures. Furthermore,
baseline operations data across the enterprise are now
routinely collected, so deviations from baseline (such as
those that might occur during a cyberattack or other
downtime) can be more easily determined.
In the event of protracted IT downtime, plans have
been drawn up based on historical paper-based
patient care procedures, including the process for
recovery and integrating paper documentation back
into electronic systems, once restored. The institution
operates redundant data centers and maintains appli-
cation availability offsite to protect the integrity of
data and make applications quickly available from
other locations, and also has agreements with vendors
to rebuild the data center and applications if needed. For
system outages that are longer than a few days, meet-
ing the immediate needs of patients becomes a matter
of business continuity planning that all organizations
should investigate more seriously, especially after
more descriptions of events like MAI are reported.
However, sweeping changes in MSMC’s operations
may limit how much MAI can inform future disaster
mitigation planning. In 2011, MSMC implemented
electronic health records (EHRs) throughout the hos-
pital. This included computerized physician order
entry, closed-loop medication administration, physi-
cian and nursing documentation, and clinical decision
support. Thus, EHR adoption may complicate down-
time procedures because it means, in an emergency,
more people are forced to switch to an increasingly
unfamiliar system. The institution recognizes this
potential pitfall and is providing downtime education
to providers and ancillary staff, and conducting drills
in patient care locations, to reinforce the competency
of documenting and placing orders using paper.
Figure 3. Effect of MAI on number of CXRs and
abdominal CT studies performed on Wednesdays in
April 2010.
Perhaps the increased vulnerability to cyberattacks
that comes with adopting EHR explains why
MAI affected the ED more than the radiology department.
At the time, the ED was completely electronic,
while departments like radiology were not. The
notably longer time for ED dispositions suggests a
prolongation in determining the appropriate care for a
patient. Alternatively, that prolongation could come
from inaccurate data collection associated with unfamiliar
paper records. The higher patient walkout rate
supports the notion that care determinations were
protracted. Compared with other weeks, 11 more
patients walked out on the day of MAI, an increase
that would not be expected to occur by chance more
than 1 percent of the time.
Data from radiology suggest a similar, if more mud-
dled, picture. That fewer CXRs were performed sug-
gests that MAI affected care. However, abdominal CTs
proceeded apace. Perhaps, to prioritize, the more critical
CT scan was preferred over the often-routine preadmis-
sion CXR. Another explanation could be that many
CXRs are ordered from the ED, so the dip in CXRs could
just reflect interruption of ED computer systems rather
than a conscious reallocation of resources.
Limitations
This study presents a narrative of the events of 1
day and compares it to the same day of the week in
surrounding weeks. The data collected from that day
are spotty and do not capture all hospital operations.
Some data were collected retrospectively and there
was no standardized reporting format. Furthermore,
our analysis assumes that the surrounding weeks are
typical examples of hospital operations. No compara-
ble data from other hospitals or other times of the year
were available to verify that assumption or account for
confounders such as seasonal variations or personnel
turnover.
Conclusions
Ultimately, the McAfee Antivirus Incident
caused a major disruption with minimal long-term
adverse impact for MSMC. Perhaps rapid notification
and response limited its impact. The data suggest that
the ED performance benchmark of time-to-disposition
was most affected. This may bode ill as hospitals
embrace EHR. The analysis of this episode suggests
that quantitatively assessing patient throughput in
different departments on a regular basis will help the
disaster management community better contextualize
and learn from disruptions.
Nicholas Genes, MD, PhD, Assistant Professor, Department of
Emergency Medicine, Icahn School of Medicine at Mount Sinai,
New York, New York.
Michael Chary, PhD, Icahn School of Medicine at Mount Sinai, New
York, New York.
Kevin W. Chason, DO, Assistant Professor, Department of Emer-
gency Medicine, Icahn School of Medicine at Mount Sinai, New
York, New York.
References
1. Charles D, King J, Patel V, et al.: Adoption of Electronic Health Record Systems among U.S. Non-federal Acute Care Hospitals: 2008-2012. 2013. ONC Data Brief. Available at http://www.healthit.gov/sites/default/files/oncdatabrief9final.pdf. Accessed August 1, 2013.
2. Wise P: Getting to meaningful use. Paper presented at Annual Meeting of Society for Academic Emergency Medicine, Phoenix, AZ, June 3-6, 2010.
3. Hoot N, Wright JC, Aronsky D: Factors contributing to computer systems downtime in the emergency department. Paper presented at AMIA Annual Symposium, Washington, DC, November 8-12, 2003.
4. Anderson M: The toll of downtime. Healthc Inform. 2002; 19(4): 27-30.
5. Valenstein P, Walsh M: Six-year trends in laboratory computer availability. Arch Pathol Lab Med. 2003; 127(2): 157-161.
6. Nelson MC: Downtime procedure for a clinical information system: A critical issue. J Crit Care. 2007; 22(1): 45-50.
7. Beratino S: Halamka on Beth Israel’s health-care IT disaster. CIO. 2003. Available at http://www.cio.com/article/31701/Halamka_on_Beth_Israel_s_Health_Care_IT_Disaster. Accessed June 1, 2013.
8. Losefsky W: The efficacy of best practices in recovery from cyberattacks. J Health Prot Manag. 2012; 28(1): 104-107.
9. Haugh R: Cyberterror. Hosp Health Netw. 2003; 77: 6-7.
10. Ragan S: Mcafee aftermath-impact numbers and recovery resources remain. The Tech Herald. 2010. Available at http://www.thetechherald.com/articles/McAfee-aftermath-impact-numbers-and-recovery-resources-remain/10002/. Accessed June 1, 2013.