
Abstract

As hospitals incorporate information technology (IT), their operations become increasingly vulnerable to technological breakdowns and attacks. Proper emergency management and business continuity planning require an approach to identify, mitigate, and work through IT downtime. Hospitals can prepare for these disasters by reviewing case studies. This case study details the disruption of computer operations at Mount Sinai Medical Center (MSMC), an urban academic teaching hospital. The events, and MSMC's response, are narrated and the impact on hospital operations is analyzed. MSMC's disaster management strategy prevented computer failure from compromising patient care, although walkouts and time-to-disposition in the emergency department (ED) notably increased. This incident highlights the importance of disaster preparedness and mitigation. It also demonstrates the value of using operational data to evaluate hospital responses to disasters. Quantifying normal hospital functions, just as with a patient's vital signs, may help quantitatively evaluate and improve disaster management and business continuity planning.
Key words: electronic health record, computer security, medical informatics, disaster planning, hospital administration
Introduction
US hospitals increasingly rely on computers. From 2008 to 2012, the fraction of US hospitals using basic electronic health records (EHRs) increased more than fourfold.1 In 2012, more than three-quarters of US hospitals reported some use of electronic documentation for providers. This increased from 10 percent in 2005.2
Hospitals will rely on technology even more as the delivery of medicine grows more complex and as the need grows to articulate vast clinical knowledge quickly in the form of easily updated quantitative guidelines.3
No matter how expected or even inevitable, this reliance is concerning, because clinical care must not fail when its servers do. Regulatory standards do not pardon a breach in patient care if a computer problem caused it. Moreover, unscheduled downtime is costly for hospitals: perhaps $14 per bed for each minute of downtime.4
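To give a sense of scale for the per-bed, per-minute figure cited above, the following is a minimal sketch that assumes the cost scales linearly with bed count and outage length. The function name and the 500-bed, 60-minute inputs are hypothetical and chosen only for illustration; they do not describe any hospital in this study.

```python
def downtime_cost(beds: int, minutes: float, cost_per_bed_minute: float = 14.0) -> float:
    """Estimate downtime cost in dollars.

    Assumes (for illustration only) that cost grows linearly with the
    number of beds and the number of minutes of downtime.
    """
    if beds < 0 or minutes < 0:
        raise ValueError("beds and minutes must be non-negative")
    return beds * minutes * cost_per_bed_minute


# Hypothetical inputs: a 500-bed hospital and a 60-minute outage.
example_cost = downtime_cost(beds=500, minutes=60)
print(f"Estimated cost: ${example_cost:,.0f}")
```

A linear model is the simplest reading of a "per bed, per minute" estimate; real costs probably depend on which systems fail and when.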
To mitigate the impact of health information technology (IT) failures on patient care, it is useful to examine what happens during these failures. Let us learn from history. However, waiting for the caprice of nature to oblige us is futile, especially as the frequency and duration of planned electronic system downtime episodes decline.5 Moreover, waiting for an electronic disaster breaks the best practices of emergency planning and business continuity. Accordingly, some hospitals conduct surprise downtime drills. For example, at Intermountain Healthcare's LDS Hospital, IT and emergency preparedness staff walk through wards and review downtime procedures with hospital personnel.6
Those efforts notwithstanding, the effects of and response to real disasters, as opposed to drills, have been infrequently described. The CIO of Beth Israel Deaconess Medical Center discussed a protracted network outage in 2002 and its mitigation strategy.7 However, the best practices for recovering from cyberattacks on a healthcare institution are grafted from guidelines created for enterprises that do not provide healthcare.8,9
An academic medical center's response to widespread computer failure
Nicholas Genes, MD, PhD; Michael Chary, BS; Kevin W. Chason, DO
Case study. DOI: 10.5055/ajdm.2013.0000
This article describes an unplanned hospital-wide IT downtime resulting from disabled personal computers (PCs). Briefly, on April 21, 2010, the antivirus software company McAfee Inc. (Santa Clara, CA) sent an erroneous update to users of McAfee VirusScan who had also installed Windows XP Service Pack 3 (Microsoft Corporation, Redmond, WA). That "update" either disconnected computers from the network or caused them to reboot incessantly, rendering the computers unusable for clinical care. This article outlines the recognition of and response to the McAfee Antivirus Incident (MAI) at Mount Sinai Medical Center (MSMC) and then discusses its impact on hospital operations.
Facility description
MSMC is a 1,171-bed academic medical center in Manhattan, NY. It has affiliates in Queens and Westchester that make use of MSMC resources for diagnostic imaging, laboratory testing, and ancillary services. During MAI, the hospital was at stage 2 on the Healthcare Information and Management Systems Society (HIMSS) scale for electronic medical record adoption. That means MSMC had a clinical data repository, participated in a regional health information organization (RHIO) to which it electronically reported its lab results, and had a picture archiving and communication system (PACS) throughout the hospital. It also had an all-electronic emergency department (ED) information system.
MSMC maintains an active IT disaster recovery program. It backs up data via tape, stores them at two offsite locations, and can recover those data within 2 days. One offsite location also maintains copies of hospital applications. MSMC completely tests this backup system every year.
Methods
Data collection
Six days after MAI, those involved in the response were debriefed. They drafted an After Action Report (AAR). Some hospital departments also provided usage statistics for the day of the incident, Wednesday, April 21, 2010, and for two Wednesdays before and one Wednesday after. The ED provided its patient census and hourly arrival rate. It also provided the average length of stay and the average times to bed, to see a nurse, to see a doctor, and to disposition or walking away from the ED. The radiology department provided the number of chest X-rays (CXRs), abdominal computed tomography (CT) scans, and chest CTs. MSMC provided the overall hospital census and discharge rate. The departments of general surgery provided data about operating room usage. This article does not discuss that because those data were insufficient to draw any conclusions. Mount Sinai's Institutional Review Board granted all data collection an exemption from its review.
Data analysis
All data were plotted in Prism 4 for Macintosh by GraphPad. As the data available were insufficient to draw statistical inferences, no statistical analysis was performed.
Disaster narrative
The MAI began on April 21, 2010, at 9:50 AM EST when a faulty McAfee file was downloaded and distributed to hospital PCs running Windows XP Service Pack 3 (see Figure 1). By 10:15 AM, some computers alerted users with a pop-up that a virus (W32/Wecorl.a) infecting the svchost.exe file had been detected. Users were given an option to clean or delete that file. Either option led to the PC shutting down, rebooting, detecting a corrupt svchost.exe file before Windows fully loaded, alerting the user, shutting down, and so on, in an infinite loop.
IT recorded the first help desk ticket regarding MAI at 10:20 AM. By 10:30, IT had received enough calls to declare a level 1 activation, triggering a notification to hospital leadership of an anticipated disruptive event (level 1 activations are often used for severe weather forecasts and are the lowest of the four levels). The entire IT department met, notified operations, and established an emergency operations command center at 11 AM. At that time, level 2 status was declared, acknowledging a minimal impact on hospital operations that nonetheless requires a coordinated interdepartmental response. The Hospital Incident Command System (HICS) was activated and personnel with Command Center responsibilities were notified via MessageOne/AlertFind, MSMC's mass notification system. The system distributed voice messages to personnel's work, home, and cell numbers (it can also send messages via short message service or e-mail). IT also readied two-way radios and called McAfee for guidance. At 11:30 AM, HICS upgraded MAI to level 3 status, indicating disruption of operations throughout the hospital but that normal patient care activities would continue. Level 4 status, which indicates that normal patient care activities are interrupted and all resources are devoted to emergency operations, was never reached during MAI.
American Journal of Disaster Medicine, Vol. 8, No. 1, Winter 2013
At 11:15 AM, McAfee informed IT that MAI came from its software falsely identifying a system file as a virus. By 11:30 AM, Desktop Support began rebuilding affected clinical PCs. It began in the ED. To mitigate damage, at 11:55 AM, a broadcast message was e-mailed to all users, asking them to implement local downtime procedures and shut off all computers. The IT HelpDesk began recommending this practice as well.
With the implementation of "downtime procedures" throughout the institution, legacy forms were used to document notes and place orders for patient care. Patient bed management and tracking on units, which was no longer possible on local workstations, moved to dry-erase whiteboards in some areas. Paper logs were maintained on all units to capture patient movements and to ensure that hospital throughput was maintained.
At 12:35 PM, Desktop Engineering announced a solution: roll back the affected McAfee file and reboot from a compact disc (CD). Hospital operations prioritized the distribution of CDs. At 1 PM, engineers, again in the ED, successfully rescued the first falsely infected PC. One PC per ED clinical care location was restored. An auto-booting CD was developed by 3 PM, and IT spread through the hospital to rescue the PCs. To provide additional staff, technically competent medical students were solicited. Teams of 7-15 canvassed the hospital, beginning with clinical areas, to apply the patch. Security traveled with the teams, when needed, to open locked office doors (it was believed that restoring the PCs behind locked doors was more efficient than logging offices and areas to return to later).
Figure 1. Timeline of Windows PC disturbance at MSMC on April 21, 2010.
For the next 6 hours, the teams roved the hospital. The night shift was informed of the incident and its fallout, and was told to continue using paper documentation and ordering. By 9:30 PM, most of the 3,200 affected PCs had been restored.
Overnight, IT staff continued to reboot affected workstations. By 7 AM on Thursday, arriving staff were advised it was safe to begin using computers that had been shut off. Additional PC restorations were done on a case-by-case basis. At 4 PM Thursday, it was decided that no further resources were needed, so the Hospital Emergency Operations Command Center was demobilized and the medical center returned to normal operations.
Analysis
After action report
The AAR suggested that having multiple independent means of communication, such as cellphones, two-way radios, and telephones, was valuable. Keeping these devices ready requires foresight but is not prohibitively expensive.
The AAR also noted that several departments lacked the supplies to keep paper records. Perhaps because of inadequate paper supplies, these departments were also unable to tabulate usage statistics during the downtime, which limited efforts to quantify the impact of MAI.
Further comments in the AAR noted that, while the laboratory system remained functional, results had to be hand-delivered, faxed, read over the phone to providers, or directed to still-functioning networked printers on units. Laboratory personnel found this process of relaying results to be labor-intensive, and staff were temporarily diverted from routine outpatient lab processing to help handle ED and inpatient needs.
CDs were distributed quickly to volunteers as they were created. The number of discs made and the names of the volunteers who received discs were not logged, which made it difficult to recover the discs from the volunteers who distributed them. Inadequate logging of disc creation and distribution itself creates a security vulnerability because the discs could later be used to compromise workstations.
Operations data
Figure 2 shows the average time in minutes for key ED milestones on Wednesdays in April 2010: the time needed for a patient to be placed in a bed, to be evaluated by an RN, to be evaluated by an MD, and to reach a disposition decision, as well as the total length of stay in the ED. All metrics on the day of MAI (April 21) were noticeably longer. However, total ED lab orders and lab orders per ED patient were relatively similar through all the dates examined (not shown). More broadly, only slightly fewer chemistry and hematology labs were processed throughout the hospital (not shown).
Figure 3 shows the number of radiology department studies processed for CXRs and abdominal CTs. More CXRs and abdominal CTs were performed on April 21 compared to other days. Although MAI was not associated with a significant overall change in the pattern of ordering of imaging studies, between noon and 4 PM on April 21, just 36 CXRs were performed, in contrast to 47, 60, and 56 during the same time period on the surrounding Wednesdays. Abdominal CTs showed no reciprocal change.
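The size of the noon-to-4 PM dip in CXRs can be expressed as a percent change from the mean of the surrounding Wednesdays. The sketch below uses only the four counts reported above; the function and variable names are illustrative, and with three comparison days this is a descriptive figure rather than a statistical test.

```python
from statistics import mean

# Chest X-rays performed between noon and 4 PM, as reported in the text.
cxr_incident_day = 36
cxr_surrounding_wednesdays = [47, 60, 56]


def percent_change_from_baseline(observed: float, baseline_values: list[float]) -> float:
    """Percent change of an observed count relative to the baseline mean."""
    baseline = mean(baseline_values)
    return 100.0 * (observed - baseline) / baseline


baseline_mean = mean(cxr_surrounding_wednesdays)
change = percent_change_from_baseline(cxr_incident_day, cxr_surrounding_wednesdays)
print(f"Baseline mean: {baseline_mean:.1f} CXRs")
print(f"Change on incident day: {change:.1f}%")
```

On these numbers, the incident-day count is roughly one-third below the mean of the comparison days.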
Surgical operations on April 21 that began before or after the primary PC downtime associated with MAI (that is, 10 AM to 4 PM for OR computers) took slightly longer: an average of 3 hours 16 minutes, compared to 2 hours 28 minutes, 2 hours 51 minutes, and 2 hours 53 minutes for the surrounding Wednesdays (not shown). For cases finishing outside that window, the OR time was similarly longer: 4 hours 5 minutes, compared to 3 hours 16 minutes, 3 hours 34 minutes, and 3 hours 25 minutes on the surrounding Wednesdays. For all Wednesdays, the overall case load was comparable.
Figure 2. Effect of MAI on ED throughput metrics. The time (minutes) to key milestones in a patient visit is reported for Wednesdays in April 2010.
Discussion
MAI resembled a cyberattack. McAfee's error disabled computers and disrupted operations at hospitals and institutions around the world, including at MSMC.10 MSMC recognized the incident and coordinated among emergency management, hospital leadership, IT, and the hospital community to respond rapidly and effectively.
From what was learned from this episode, MSMC has launched an emergency management committee that has implemented notification protocols to respond to future IT failures. The committee has created administrative policy to guide hospital operations during IT disruptions and to ensure business continuity. The policy standardizes the location of downtime forms and supplies and the recovery of information when IT systems return to availability.
Future plans for widespread computer failure will involve logging the creation and distribution of boot discs to ensure complete recovery, offsite and local data backups to enable safety of patient data and speedy resumption of activities after a downtime, and drills to better train staff on downtime procedures. Furthermore, baseline operations data across the enterprise are now routinely collected, so deviations from baseline (such as those that might occur during a cyberattack or other downtime) can be more easily determined.
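One way routinely collected baseline data could be used to spot deviations is sketched below. This is an assumption-laden illustration, not MSMC's method: the metric names, the sample values, and the two-standard-deviation threshold are all hypothetical, and a z-score computed from a handful of baseline days is only a crude screen.

```python
from statistics import mean, stdev


def flag_deviations(
    baseline: dict[str, list[float]],
    observed: dict[str, float],
    threshold: float = 2.0,
) -> dict[str, float]:
    """Return z-scores for metrics that deviate from their baseline.

    A metric is flagged when its observed value lies more than
    `threshold` sample standard deviations from the baseline mean.
    Metrics with fewer than two baseline values, no observed value,
    or zero baseline spread are skipped.
    """
    flagged = {}
    for metric, history in baseline.items():
        if metric not in observed or len(history) < 2:
            continue
        spread = stdev(history)
        if spread == 0:
            continue
        z_score = (observed[metric] - mean(history)) / spread
        if abs(z_score) > threshold:
            flagged[metric] = z_score
    return flagged


# Hypothetical values, for illustration only (not data from this study).
baseline = {
    "ed_time_to_disposition_min": [180, 175, 190, 185],
    "ed_walkouts": [9, 11, 10, 12],
}
observed = {"ed_time_to_disposition_min": 260, "ed_walkouts": 11}
print(flag_deviations(baseline, observed))
```

With these made-up inputs only the time-to-disposition metric is flagged; the walkout count falls within ordinary variation.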
In the event of protracted IT downtime, plans have been drawn up based on historical paper-based patient care procedures, including the process for recovering paper documentation and integrating it back into electronic systems once they are restored. The institution operates redundant data centers and maintains application availability offsite to protect the integrity of data and make applications quickly available from other locations, and also has agreements with vendors to rebuild the data center and applications if needed. For system outages that are longer than a few days, meeting the immediate needs of patients becomes a matter of business continuity planning that all organizations should investigate more seriously, especially after more descriptions of events like MAI are reported.
However, sweeping changes in MSMC's operations may limit how much MAI can inform future disaster mitigation planning. In 2011, MSMC implemented electronic health records (EHRs) throughout the hospital. This included computerized physician order entry, closed-loop medication administration, physician and nursing documentation, and clinical decision support. Thus, EHR adoption may complicate downtime procedures because it means that, in an emergency, more people are forced to switch to an increasingly unfamiliar system. The institution recognizes this potential pitfall and is providing downtime education to providers and ancillary staff, and conducting drills in patient care locations, to reinforce the competency of documenting and placing orders using paper.
Perhaps the increased vulnerability to cyberattacks that comes with adopting EHRs explains why MAI affected the ED more than the radiology department. At the time, the ED was completely electronic, while departments like radiology were not. The notably longer time for ED dispositions suggests a prolongation in determining the appropriate care for a patient. Alternatively, that prolongation could come from inaccurate data collection associated with unfamiliar paper records. The higher patient walkout rate supports the notion that care determinations were protracted. Compared with other weeks, 11 more patients walked out on the day of MAI, an increase that would not be expected to occur by chance more than 1 percent of the time.
Figure 3. Effect of MAI on number of CXRs and abdominal CT studies performed on Wednesdays in April 2010.
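The text does not say how the chance estimate for the walkout increase was obtained. One simple way to frame such a question, offered here only as a sketch, is to treat daily walkouts as a Poisson count and ask how likely a count at least as large as the observed one would be. The baseline of 10 walkouts per day below is a hypothetical placeholder, not a figure from this study; only the excess of 11 comes from the text.

```python
import math


def poisson_tail(mean_count: float, observed: int) -> float:
    """Return P(X >= observed) for X ~ Poisson(mean_count)."""
    if mean_count <= 0:
        raise ValueError("mean_count must be positive")
    # Sum P(X = k) for k below the observed count, then take the complement.
    below = sum(
        math.exp(-mean_count) * mean_count**k / math.factorial(k)
        for k in range(observed)
    )
    return 1.0 - below


# Hypothetical baseline; the study does not report the usual walkout count.
hypothetical_baseline_walkouts = 10
observed_walkouts = hypothetical_baseline_walkouts + 11  # "11 more" per the text

p_value = poisson_tail(hypothetical_baseline_walkouts, observed_walkouts)
print(f"P(at least {observed_walkouts} walkouts) = {p_value:.4f}")
```

The result depends strongly on the assumed baseline, so a calculation like this is only as good as the routinely collected baseline data behind it.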
Data from radiology suggest a similar, if more muddled, picture. That fewer CXRs were performed suggests that MAI affected care. However, abdominal CTs proceeded apace. Perhaps, to prioritize, the more critical CT scan was preferred over the often-routine preadmission CXR. Another explanation could be that many CXRs are ordered from the ED, so the dip in CXRs could just reflect interruption of ED computer systems rather than a conscious reallocation of resources.
Limitations
This study presents a narrative of the events of 1 day and compares it to the same day of the week in surrounding weeks. The data collected from that day are spotty and do not capture all hospital operations. Some data were collected retrospectively and there was no standardized reporting format. Furthermore, our analysis assumes that the surrounding weeks are typical examples of hospital operations. No comparable data from other hospitals or other times of the year were available to verify that assumption or account for confounders such as seasonal variations or personnel turnover.
Conclusions
Ultimately, the McAfee Antivirus Incident caused a major disruption with minimal long-term adverse impact for MSMC. Perhaps rapid notification and response limited its impact. The data suggest that the ED performance benchmark of time-to-disposition was most affected. This may bode ill as hospitals embrace EHRs. The analysis of this episode suggests that quantitatively assessing patient throughput in different departments on a regular basis will help the disaster management community better contextualize and learn from disruptions.
Nicholas Genes, MD, PhD, Assistant Professor, Department of Emergency Medicine, Icahn School of Medicine at Mount Sinai, New York, New York.
Michael Chary, PhD, Icahn School of Medicine at Mount Sinai, New York, New York.
Kevin W. Chason, DO, Assistant Professor, Department of Emergency Medicine, Icahn School of Medicine at Mount Sinai, New York, New York.
References
1. Charles D, King J, Patel V, et al.: Adoption of Electronic Health Record Systems among U.S. Non-federal Acute Care Hospitals: 2008-2012. 2013. ONC Data Brief. Available at http://www.healthit.gov/sites/default/files/oncdatabrief9final.pdf. Accessed August 1, 2013.
2. Wise P: Getting to meaningful use. Paper presented at Annual Meeting of Society for Academic Emergency Medicine, Phoenix, AZ, June 3-6, 2010.
3. Hoot N, Wright JC, Aronsky D: Factors contributing to computer systems downtime in the emergency department. Paper presented at AMIA Annual Symposium, Washington, DC, November 8-12, 2003.
4. Anderson M: The toll of downtime. Healthc Inform. 2002; 19(4): 27-30.
5. Valenstein P, Walsh M: Six-year trends in laboratory computer availability. Arch Pathol Lab Med. 2003; 127(2): 157-161.
6. Nelson MC: Downtime procedure for a clinical information system: A critical issue. J Crit Care. 2007; 22(1): 45-50.
7. Beratino S: Halamka on Beth Israel's health-care IT disaster. CIO. 2003. Available at http://www.cio.com/article/31701/Halamka_on_Beth_Israel_s_Health_Care_IT_Disaster. Accessed June 1, 2013.
8. Losefsky W: The efficacy of best practices in recovery from cyberattacks. J Health Prot Manag. 2012; 28(1): 104-107.
9. Haugh R: Cyberterror. Hosp Health Netw. 2003; 77: 6-7.
10. Ragan S: McAfee aftermath-impact numbers and recovery resources remain. The Tech Herald. 2010. Available at http://www.thetechherald.com/articles/McAfee-aftermath-impact-numbers-and-recovery-resources-remain/10002/. Accessed June 1, 2013.
American Journal of Disaster Medicine
, Vol. 8, No. 1, Winter 2013
66
AJDM_7-0_00-Genes-130014.qxd 8/6/13 5:21 PM Page 6
PROOF COPY ONLY
DO NOT DISTRIBUTE
PROOF COPY ONLY
DO NOT DISTRIBUTE
... Cyberbreaches are an increasing threat to iEHR worldwide, resulting in inaccessibility of critical clinical information and functions (Clarke & Youngstein, 2017;Healthcare Information and Management Systems Society, 2019). Facilities are also susceptible to sudden ICT outages due to internal disruptions such as hardware failures (Coffey, Postal, Houston, & McKeeby, 2016), software bugs (Genes, Chary, & Chason, 2013) and failed ITC upgrades (Wretborn, Ekelund, & Wilhelms, 2019). Unexpected system outages pose a threat to patient safety, potential loss or compromise of data, and disruptions to continuity of health care delivery (Harrison, Siwani, Pickering, & Herasevich, 2019;Larsen, Fong, Wernz, & Ratwani, 2017;Wang et al., 2016;Wretborn et al., 2019). ...
... Communication is acknowledged as a critical challenge and key focus for improvement by health facility stakeholders who have experienced unplanned downtime events (Coffey et al., 2016;Genes et al., 2013;Larsen et al., 2019;Primeau, 2018). Information sharing, a reciprocal process of sending and receiving information, predicts effective team performance (Mesmer-Magnus & DeChurch, 2009). ...
... This led to confusion and falsestarts but ensured solutions were tested and safe for patients. Other hospitals reported an interim period of instability while the source of iEHR disruptions were identified and solutions provided (Coffey et al., 2016;Genes et al., 2013). In the current study, concurrent use of digital and paper records introduced the risk of missing or duplicated documentation, particularly in relation to medication. ...
Article
Background There are few descriptions of management of unplanned hospital-wide digital downtime and impact on patient care in health literature. Aim The aim of this study was to undertake a qualitative review of a prolonged critical technology downtime event in an Australian hospital in 2017. Methods Inductive content analysis was conducted on data collected through face-to-face, semi-structured, individual interviews conducted with nine hospitals employees (five nurses with direct-care/operational responsibilities, and four executive staff, including nursing) who played a role in the incident. Findings Analysis of the data using an open-source R package led to the extraction of 139 codes, 13 first-level categories, and 4 main categories. Main categories extracted were: impact of event, response to the event, resilience and institutional reserve, and challenges and learnings. Discussion The overall experience for interview participants was positive. Effective communication methods, particularly vertical communication, enabled multi-disciplinary teams (comprising nursing, medical and pharmacy personnel) to safely transition back from downtime paper records to the integrated electronic health record with no harm to patients. Participants identified teamwork contributed to a sense of comradery with clinical colleagues and executive staff. Contingency planning and training are essential for ensuring safe and effective management of technology downtime events. Conclusion The prolonged digital disruption and subsequent recovery was managed effectively using a face-to-face communication and support approach. This approach reduced the impact of the digital downtime and ensured patient safety. The data analysis strategy was enhanced using an computer-assisted qualitative data analysis software.
... Genes et al. 38 After-action reporting of a medical center's response to an unplanned EHR downtime. ...
... Future studies should focus on and utilize the experiential knowledge that most, if not all, hospitals possess from downtimes. This experiential knowledge may be obtained using techniques similar to the after-action reporting analysis from Genes et al. 38 and Little et al., 43 the general historical review from Bulson et al., 41 or the survey data collection from Sittig et al. 39 Continued research into the abstract concept of downtime events without regarding the available concrete experiences misses an opportunity to expand our knowledge of downtime. Similarly, in order to move beyond the accepted belief that downtime events impact safety and performance in general, analyzing records of prior events can identify specific areas of the hospital that are heavily impacted and, therefore, should be the focus for improvement efforts. ...
Full-text available
Article
Electronic health record downtimes are any period where the computer systems are unavailable, either for planned or unexpected events. During an unexpected downtime, healthcare workers are rapidly forced to use rarely-practiced, paper-based methods for healthcare delivery. In some instances, patient safety is compromised or data exposed to parties seeking profit. This review provides a foundational perspective of the current state of downtime readiness as organizations prepare to handle downtime events. A search of technical news media related to healthcare informatics and a scoping review of the research literature were conducted. Findings ranged from theoretical exploration of downtime to empirical direct comparison of downtime versus normal operation. Overall, 166 US hospitals experienced a total of 701 days of downtime in 43 events between 2012 and 2018. Almost half (48.8%) of the published downtime events involved some form of cyber-attacks. Downtime contingency planning is still predominantly considered through a top-down organizational focus. We propose that a bottom-up approach, involving the front-line clinical staff responsible for executing the downtime procedure, will be beneficial. Significant new research support for the development of contingency plans will be needed.
... Genes et al. 38 After-action reporting of a medical center's response to an unplanned EHR downtime. ...
... Future studies should focus on and utilize the experiential knowledge that most, if not all, hospitals possess from downtimes. This experiential knowledge may be obtained using techniques similar to the after-action reporting analysis from Genes et al. 38 and Little et al., 43 the general historical review from Bulson et al., 41 or the survey data collection from Sittig et al. 39 Continued research into the abstract concept of downtime events without regarding the available concrete experiences misses an opportunity to expand our knowledge of downtime. Similarly, in order to move beyond the accepted belief that downtime events impact safety and performance in general, analyzing records of prior events can identify specific areas of the hospital that are heavily impacted and, therefore, should be the focus for improvement efforts. ...
Preprint
BACKGROUND Electronic Health Record Systems have become ubiquitous in the delivery of patient care. While the implementation has brought safety and efficiency boosts to the industry, it has also exposed patients and their data to new risks in the form of downtime. Downtimes are any period where the computer systems are unavailable and these periods occur for updates or upgrades, but can also be triggered by deliberate cyber-attack. During an unexpected downtime, healthcare workers are forced to fall back to rarely practiced paper-based methods for healthcare delivery, while at the same time, patient data is potentially exposed to parties seeking to profit from its sale. OBJECTIVE We sought to provide a foundational perspective of the current state of downtime readiness in light of the growing cyber-attack threat on healthcare data and hospital networks. METHODS A search of technical news media related to healthcare informatics and a scoping review of research literature were conducted. Following the ENTEREQ framework, 1,651 records were retrieved, of which 16 were included in the final review. RESULTS 164 US-based hospitals experienced a total of 670 days of downtime in 41 events between 2012 and 2018. Almost half (48.8%) of the published downtime events involved some form of cyber-attack. 1,651 studies matching downtime search strings were found, 16 of which were found to meet inclusion criteria. Few research studies have a downtime emphasis; those that do are predominantly focused on a top-down approach. They were found to have a range of focus from the theoretical exploration of downtime to direct empirical comparison of downtime versus normal operation. CONCLUSIONS Downtime contingency planning is still predominantly considered in abstract or top-down organizational focus. It is proposed that a bottom-up approach to comprehending and addressing downtime will be beneficial due to the complicated nature of patient care and computer downtime events. 
A bottom-up approach would involve the front-line clinical staff responsible for executing the downtime procedure and directly caring for the patients. EHR downtime events will continue to be a complication to hospital and healthcare operations. Significant new research support for the development of contingency plans will be needed as the cyber-attack threat continues to grow.
... nostic and other test results, and patient tracking. Unexpected downtime of electronic health information systems leads to longer operative times and increased time to disposition for patients in the emergency department [79]. A growing concern, beyond the scope of this review, are cybersecurity threats to HIT infrastructures, including ransomware. ...
Article
Objective Hospitals are a key component to disaster response but are susceptible to the effects of disasters as well, including infrastructure damage that disrupts patient care. These events offer an opportunity for evaluation and improvement of preparedness and response efforts when hospitals are affected directly by a disaster. The objective of this structured review was to evaluate the existing literature on hospitals as disaster victims. Methods A structured and scoping review of peer-reviewed literature, gray literature, and news reports related to hospitals as disaster victims was completed to identify and analyze themes and lessons observed from disasters in which hospitals are victims, to aid in future emergency operations planning and disaster response. Results The literature search and secondary search of references identified 366 records in English. A variety of common barriers to successful disaster response include loss of power, water, heating and ventilation, communications, health information technology, staffing, supplies, safety and security, and structural and non-structural damage. Conclusions There are common weaknesses in disaster preparedness that we can learn from and account for in future planning with the aim of improving resilience in the face of future disasters.
... However, information security policies alone are not sufficient to protect a healthcare organization. They should be accompanied by additional measures such as security awareness training sessions and distributed policy statements [18,19]. ...
Article
Background: Connected medical devices and electronic health records have added important functionality to patient care, but have also introduced a range of cybersecurity concerns. When a healthcare organization suffers from a cybersecurity incident, its incident response strategies are critical to the success of its recovery. Objective: In this article, we identify gaps in research concerning cybersecurity response plans in health care. Through a systematic literature review, we develop aggregated strategies that professionals can use to construct better response strategies in their organizations. Methods: We reviewed journal articles on cyber incident response plans in health care published in PubMed and Web of Science. We sought to collect articles on the intersection of cybersecurity and health care that focused on incident response strategies. Results: We identified and reviewed thirteen articles for cybersecurity response recommendations. We then extracted information such as research methods, findings, and implications. Finally, we synthesized the recommendations into a framework of eight aggregated response strategies (EARS) that fall under managerial and technological categories. A direct comparison of EARS with other frameworks demonstrates the necessity of utilizing EARS in addition to these commonly accepted frameworks. While existing frameworks are undeniably useful, we have identified at least one point for potential improvement in each framework. Conclusions: We conducted a systematic review of the literature on cybersecurity response plans in health care and developed a novel framework for response strategies that could be deployed by healthcare organizations. More work is needed to evaluate incident response strategies in health care.
... Despite the undisputed relevance of the measures to avoid risks, experience and current events indicate that IT disasters just happen despite all measure of good prevention [3,37,38]. In such cases, the only help is to be prepared for the worst case scenario [39][40][41]. ...
Article
Introduction: As many medical workflows depend vastly on IT support, great demands are placed on the availability and accuracy of the applications involved. The cases of IT failure through ransomware at the beginning of 2016 are impressive examples of the dependence of clinical processes on IT. Although IT risk management attempts to reduce the risk of IT blackouts, the probability of partial/total data loss, or even worse, data falsification, is not zero. The objective of this paper is to present the state of the art with respect to strategies, processes, and governance to deal with the failure of IT systems. Methods: This article is conducted as a narrative review. Results: Worst case scenarios are needed, dealing with methods as to how to survive the downtime of clinical systems, for example through alternative workflows. These workflows have to be trained regularly. We categorize the most important types of IT system failure, assess the usefulness of classic counter measures, and state that most risk management approaches fall short on exactly this matter. Conclusion: To ensure that continuous, evidence-based improvements to the recommendations for IT emergency concepts are made, it is essential that IT blackouts and IT disasters are reported, analyzed, and critically discussed. This requires changing from a culture of shame and blame to one of error and safety in healthcare IT. This change is finding its way into other disciplines in medicine. In addition, systematically planned and analyzed simulations of IT disaster may assist in IT emergency concept development.
Chapter
Security constraints that enforce security requirements characterize healthcare systems. These constraints have a substantial impact on the resiliency of the final system. Security requirements modelling approaches allow the prevention of cyber incidents; however, the focus to date has been on prevention rather than resiliency. Resiliency extends into the detection, mitigation and recovery after security violations. In this paper, we propose an enhanced at a conceptual level that attempts to align cybersecurity with resiliency. It does so by extending the Secure Tropos cybersecurity modelling language to include resiliency. The proposed conceptual model examines resiliency from three viewpoints, namely the security requirements, the healthcare context and its implementational capability. We present an overview of our conceptual model of a cyber resiliency language and discuss a case study to attest the healthcare context in our approach.
Most healthcare facilities have comprehensive disaster plans, the author says, but he questions whether a cyberattack plan is included. In this article, he outlines the basics of such a plan.
Article
Failure of a clinical laboratory computer system can disrupt work flow and charge capture and affect patient care. The first comprehensive survey of computer downtime was conducted in 1995 and demonstrated significant interinstitutional variation in system availability. Despite numerous changes in the laboratory and computer industries since 1995, no follow-up study has been reported. To quantify current laboratory computer availability and compare it with 1995 performance. Ninety-seven laboratories prospectively recorded the frequency and duration of computer downtime during 30 days in 2001. Results were compared with 1995 survey data. For the median facility, the number of downtime episodes decreased from 8 events per 30 days during 1995 to 3 events per 30 days during 2001 (P <.01). The frequency of unscheduled downtime also improved, from a median of 2 to 0.5 events per 30 days (P <.01). Reduced downtime events were paralleled by reduced cumulative downtime (14.3 vs 4.0 hours per 30 days; P <.01). Improvements were not restricted to the median facility; laboratories performing in the bottom quartile in 2001 recorded substantially less downtime than laboratories in the bottom quartile in 1995. When the comparison was restricted to the 37 institutions that participated in both the 1995 and 2001 surveys, a significant reduction in overall downtime and unscheduled downtime events was still evident (P <.01). More recent installation of vendor software patches was associated with a reduced frequency of downtime events in the 2001 data set. Laboratory computer downtime was less frequent in 2001 than in 1995; industry performance appears to be improving.
Article
Little research has been conducted on the nature and impact of system downtime. Downtime may have a major impact both on patient care and on operating costs. We examined different system components that contributed to downtime of an emergency department information system and characterized the frequency and length of these downtimes during a period of four months.
Article
As computers become embedded in clinical workflow processes, disruptions to access can have serious consequences. The Health Evaluation through Logical Processing system at LDS Hospital is a computerized hospital information system that has been under continuous development for more than 30 years. The system maintains a 99.85% uptime and averages more than 17,000 logons per day. The first formal downtime plan for this system was developed in 1992 in anticipation of a major hardware installation. In early 2000 after a series of planned downtimes from which we did not recover smoothly, our Software Oversight Committee became interested in understanding downtime procedures. A downtime plan for clinical users was developed and tested and is discussed. A March 2000 downtime survey of 103 clinical staff provided additional information to refine the plan. The downtime plan now includes explicit instructions about the clinical data that must be reentered after a downtime and also includes a plan for a regularly scheduled downtime practice drill similar to a fire drill.
Kevin W. Chason, DO, Assistant Professor, Department of Emergency Medicine, Icahn School of Medicine at Mount Sinai, New York, New York.
References
Charles D, King J, Patel V, et al.: Adoption of Electronic Health Record Systems among U.S. Non-federal Acute Care Hospitals: 2008-2012. 2013. ONC Data Brief. Available at http://www.healthit.gov/sites/default/files/oncdatabrief9final.pdf. Accessed August 1, 2013.
Wise P: Getting to meaningful use. Paper presented at Annual Meeting of Society for Academic Emergency Medicine, Phoenix, AZ, June 3-6, 2010.
Anderson M: The toll of downtime. Healthc Inform. 2002; 19(4): 27-30.
Beratino S: Halamka on Beth Israel's health-care IT disaster. CIO. 2003. Available at http://www.cio.com/article/31701/Halamka_on_Beth_Israel_s_Health_Care_IT_Disaster. Accessed June 1, 2013.