
The Price of Artificial Intelligence

Enrico Coiera
Australian Institute of Health Innovation, Macquarie University, Sydney, NSW, Australia

IMIA Yearbook of Medical Informatics 2019
© 2019 IMIA and Georg Thieme Verlag KG
Yearb Med Inform 2019:14-5. Published online: 25.04.2019

Introduction: Whilst general artificial intelligence (AI) is yet to appear, today's narrow AI is already good enough to transform much of healthcare over the next two decades.
Objective: There is much discussion of the potential benefits of AI in healthcare, and this paper reviews the cost that may need to be paid for these benefits, including changes in the way healthcare is practiced, patients are engaged, medical records are created, and work is reimbursed.
Results: Whilst AI will be applied to classic pattern recognition tasks like diagnosis or treatment recommendation, it is likely to be as disruptive to clinical work as it is to care delivery. Digital scribe systems that use AI to automatically create electronic health records promise great efficiency for clinicians but may lead to very different types of clinical records and workflows. In disciplines like radiology, AI is likely to see image interpretation become an automated process with diminishing human engagement. Primary care is also being disrupted by AI-enabled services that automate triage, along with services such as telemedical consultations. This altered future may necessarily see an economic change in which clinicians are increasingly reimbursed for value, and AI is reimbursed at a much lower cost for volume.
Conclusion: AI is likely to be associated with some of the biggest changes we will see in healthcare in our lifetime. To fully engage with this change brings promise of the greatest reward. To not engage is to pay the highest price.

Keywords: Artificial intelligence, electronic health record, radiology, primary care, value-based care
We are not ready for what is about to come. It is not that healthcare will soon be run by a web of artificial intelligences (AIs) that are smarter than humans; such general AI does not appear anywhere near the horizon. Rather, the narrow AI that we already have, with all its flaws and limitations, is already good enough to transform much of what we do, if applied carefully.
Amara's Law tells us that we tend to overestimate the impact of a technology in the short run, but underestimate its impact in the long run [1]. There is no doubt that AI has gone through another boom cycle of inflated expectations, and that some will be disappointed that promised breakthroughs have not materialized. Yet, despite this, the next decade will see a steadily growing stream of AI applications across healthcare. Many of these applications may initially be niche, but eventually they will become mainstream and lead to substantial change in the business of healthcare. In twenty years' time, there is every prospect that the changes we find will be transformational.
Such transformation, however, comes with a price. For all the benefits that will come through improved efficiency, safety, and clinical outcomes, there will be costs [2]. The nature of change is that it often seems to appear suddenly. While we are all daily distracted trying to make our unyielding health system bend to our needs using traditional approaches, disruptive change surprises us because it comes from places we least expect, and in ways we never quite imagine.
In linguistics, the Whorf hypothesis says that we can only imagine what we can speak of [3]; our cognition is limited by the concepts we have words for. It is much the same in the world of health informatics. We have developed strict conceptual structures that corral AI into solving classic pattern recognition tasks like diagnosis or treatment recommendation. We think of AI automating image interpretation, or sifting electronic health record data for personalized treatment recommendations. We rarely think about AI automating foundational business processes. Yet AI is likely to be more disruptive to clinical work in the short run than it will be to care delivery.
Digital scribes, for example, will steadily take on more of the clinical documentation task [4]. Scribes are digital assistants that listen to clinical talk such as patient consultations. They may undertake a range of tasks, from simple transcription through to the summarization of key speech elements into the electronic record, as well as providing information retrieval and question-answering services. The promise of digital scribes is a reduction in human documentation burden. The price for this help will be a re-engineering of the clinical encounter.
The technology to recognize and interpret clinical speech from multiple speakers, and to transform that speech into accurate clinical summaries, is not yet here. However, if humans are willing to change how they speak, for example by giving an AI commands and hints, then much can be done today. It is easier for a human to say "Scribe, I'd like to prescribe some medication" than for the AI to be trained to accurately recognize whether the speech it is listening to is past history, present history, or prescription talk.
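The asymmetry described above can be made concrete with a small sketch. Everything here is illustrative and not from the paper: the wake word "Scribe" and the keyword lists stand in for what, in a real system, would be a trained speech-understanding model.

```python
# Hypothetical sketch: an utterance prefixed with a wake word can be
# routed with trivial logic, while unconstrained clinical speech forces
# the system to guess which part of the note it belongs to.
# The wake word and keyword lists below are invented for illustration.

COMMAND_PREFIX = "scribe,"

# Crude keyword heuristic standing in for a trained section classifier.
SECTION_KEYWORDS = {
    "prescription": ["prescribe", "medication", "dose"],
    "past history": ["years ago", "previously", "history of"],
    "present history": ["today", "currently", "this week"],
}

def route_utterance(utterance: str) -> tuple[str, str]:
    """Return (handler, payload) for a single utterance."""
    text = utterance.strip().lower()
    if text.startswith(COMMAND_PREFIX):
        # Explicit command: no ambiguity about the speaker's intent.
        return ("command", text[len(COMMAND_PREFIX):].strip())
    # Free speech: fall back to guessing the note section.
    for section, keywords in SECTION_KEYWORDS.items():
        if any(k in text for k in keywords):
            return (section, text)
    return ("unclassified", text)

print(route_utterance("Scribe, I'd like to prescribe some medication"))
print(route_utterance("She has a history of asthma"))
```

The point of the sketch is the design trade-off: the explicit command path is reliable precisely because the human has changed how they speak, while the free-speech path inherits all the brittleness of automated classification.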
The price for using a scribe might also be an even more obvious intrusion of technology between patient and clinician, and new risks to patient privacy, because speech data contain even more private information than clinician-generated records. Clinicians might simply replace today's effort in creating records, where they have control over content, with new work in reviewing and editing automated records, where content reflects the design of the AI. There are also subtler risks. Automation bias might mean that many clinicians cease to worry about what should go into a clinical document, and simply accept whatever a machine has generated
[5]. Given the widespread use of copy and paste in current-day electronic records [6], such an outcome seems a distinct possibility.
At this moment, narrow AI, predominantly in the form of deep learning, is making great inroads into pattern recognition tasks such as diagnostic radiological image interpretation [7]. The sheer volume of training data now available, along with access to cheap computational resources, has allowed previously impractical neural network architectures to come into their own. When a price for deep learning is discussed, it is often in terms of the end of clinical professions such as radiology or dermatology [8]: human expertise is to be rendered redundant by super-human automation.
The reality is much more nuanced. Firstly, there remain great challenges to generalizing narrow AI methods. A well-trained deep network typically does better on data sets that resemble its training population [9]. The appearance of unexpected new edge cases, or the implicit learning of features such as clinical workflow or image quality [10], can both degrade performance. One remedy for this limitation is transfer learning [11]: retraining an algorithm on new data taken from the local context in which it will operate. So, just as we have seen with electronic records, the prospect of cheap and generalizable technology might be a fantasy, and expensive system localization and optimization may become the lived AI reality.
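The transfer-learning remedy can be illustrated with a toy sketch (not code from the paper): a "pretrained" feature extractor is kept frozen, and only a small final classifier is retrained on locally labelled examples. The extractor, the data, and every hyperparameter below are invented for illustration.

```python
import math

# Hypothetical sketch of the transfer-learning remedy: reuse a frozen
# "pretrained" network body and retrain only a small head on local data.
# The extractor, toy data, and hyperparameters are all invented.

def pretrained_features(x: list) -> list:
    # Stand-in for a frozen deep network body: a fixed nonlinear map.
    return [math.tanh(x[0] + x[1]), math.tanh(x[0] - x[1])]

def train_local_head(data, labels, lr=0.5, epochs=200):
    """Fit a logistic-regression head on locally labelled examples."""
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(data, labels):
            f = pretrained_features(x)
            z = w[0] * f[0] + w[1] * f[1] + b
            p = 1.0 / (1.0 + math.exp(-z))
            # Gradient step on log loss; the extractor stays frozen.
            err = p - y
            w[0] -= lr * err * f[0]
            w[1] -= lr * err * f[1]
            b -= lr * err
    return w, b

def predict(w, b, x):
    f = pretrained_features(x)
    return 1 if w[0] * f[0] + w[1] * f[1] + b > 0 else 0

# Tiny "local" dataset: label 1 only when both inputs are positive.
local_x = [[1.0, 1.0], [-1.0, -1.0], [1.0, -1.0], [-1.0, 1.0]]
local_y = [1, 0, 0, 0]

w, b = train_local_head(local_x, local_y)
print([predict(w, b, x) for x in local_x])  # -> [1, 0, 0, 0]
```

The design point is the same one made in the text: the expensive deep model is reused as-is, while the cheap-to-train head absorbs the peculiarities of the local site.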
Secondly, the radiological community has reacted early, and proactively, to these challenges. Rather than resisting change, the specialty shows strong evidence of actively embracing AI, and of understanding that change brings not just risks but opportunities. In the future, radiologists might be freed from working in darkened reading rooms, and emerge to become highly visible participants in clinical care. Indeed, in the future, the idea of being an expert in just a single task such as image interpretation may seem quaint, as radiologists transform into diagnostic experts, integrating data from multiple modalities, from the genetic through to the radiologic.
The highly interconnected nature of healthcare means that changes in one part of the system will require different changes elsewhere. Radiologists in many parts of the world are paid for each image they read. With the arrival of cheap bulk AI image interpretation, that payment model must change: the price of reading must surely drop, and expert humans must instead be paid for the value they create, not the volume they process.
The same kind of business pressure is being felt in other clinical specialties. In primary care, for example, the arrival of new, sometimes aggressive, players who base their business model on AI patient triage and telemedicine is already problematic [12, 13]. Patients might love the convenience of such services, especially when they are technologically literate, young, and in good health, but they may not always be so well served if they are older, or have complex comorbidities [14]. Thus, AI-based primary care services might end up caring for profitable low-cost and low-risk patients, and leave the remainder to be managed by a financially diminished existing primary care system. One remedy to such a risk is again to move away from reimbursement for volume to reimbursement for value. Indeed, value-based healthcare might arrive not as the product of government policy, but as a necessary side effect of AI automation.
There are thus early lessons in the different reactions to AI between primary care and radiology. One sector is being caught by surprise, playing catch-up to new commercial realities that have arrived more quickly than expected; the other has begun to reimagine itself in anticipation of becoming the one that crafts the new reality. The price each sector pays is different. Proactive preparation requires investment in reshaping the workforce, and active engagement with industry, consumers, and government. It requires serious consideration of new safety and ethical risks [15]. In contrast, reactive resistance takes a toll on clinical professionals who rightly wish to defend their patients' interests, as much as their own right to have a stake in them. Unexpected change may end up eroding or even destroying important parts of the existing health system before there is a chance to modernize them.
So, the fate of medicine, and indeed of all of healthcare, is to change [15]. As change makers go, AI is likely to be among the biggest we will see in our time. Its tendrils will touch everything from basic biomedical discovery science through to the way we each make our daily personal health decisions. For such change we must expect to pay a price. What is paid, by whom, and who benefits, all depend very much on how we engage with this profound act of reinvention. To fully engage brings promise of the greatest reward. To not engage is to pay the highest price.
References

1. Roy Amara 1925–2007, American futurologist. In: Ratcliffe S, editor. Oxford Essential Quotations. 4th ed; 2016.
2. Schwartz WB. Medicine and the computer: the promise and problems of change. N Engl J Med.
3. Kay P, Kempton W. What is the Sapir-Whorf hypothesis? Am Anthropol 1984;86(1):65-79.
4. Coiera E, Kocaballi B, Halamka J, Laranjo L. The digital scribe. NPJ Digit Med 2018;1:58.
5. Lyell D, Coiera E. Automation bias and verification complexity: a systematic review. J Am Med Inform Assoc 2017;24(2):423-31.
6. Siegler EL, Adelman R. Copy and paste: a remediable hazard of electronic health records. Am J Med 2009 Jun;122(6):495-96.
7. Litjens G, Kooi T, Bejnordi BE, Setio AAA, Ciompi F, Ghafoorian M, et al. A survey on deep learning in medical image analysis. Med Image Anal 2017 Dec;42:60-88.
8. Darcy AM, Louie AK, Roberts LW. Machine learning and the profession of medicine. JAMA.
9. Chen JH, Asch SM. Machine learning and prediction in medicine - beyond the peak of inflated expectations. N Engl J Med 2017;376(26):2507-09.
10. Zech JR, Badgeley MA, Liu M, Costa AB, Titano JJ, Oermann EK. Variable generalization performance of a deep learning model to detect pneumonia in chest radiographs: a cross-sectional study. PLoS Med 2018 Nov 6;15(11):e1002683.
11. Pan SJ, Yang Q. A survey on transfer learning. IEEE Trans Knowl Data Eng 2010;22(10):1345-59.
12. McCartney M. General practice can't just exclude sick people. BMJ 2017;359:j5190.
13. Fraser H, Coiera E, Wong D. Safety of patient-facing digital symptom checkers. Lancet 2018 Nov.
14. Marshall M, Shah R, Stokes-Lampard H. Online consulting in general practice: making the move from disruptive innovation to mainstream service. BMJ 2018 Mar 26;360:k1195.
15. Coiera E. The fate of medicine in the time of AI. Lancet 2018;392(10162):2331-2.
Correspondence to:
Enrico Coiera
Australian Institute of Health Innovation
Macquarie University
Level 6 75 Talavera Rd
Sydney, NSW 2109, Australia
... Despite these challenges, AI can still constitute a highly impactful technology. Scholars have therefore recommended involving pathologists in development, implementation, and governance processes in order to optimize this impact 2,6,11 . Previous empirical studiestwo surveys and two qualitative interview studieshave focused on exploring the views of pathologists concerning AI [12][13][14][15] . ...
... On paper, it is therefore mostly a 'simple' question of opportunity: When are the circumstances right for AI to be implemented in pathology? In reality, this question proves harder to answer; technical as well as ethical challenges to AI implementation have been formulated and require a clear strategy to tackle them 8,11,31 . Specifically, it requires members of pathology labs, with their extensive knowledge on practices, roles and responsibilities within pathology, to reflect on the way in which AI can and should be implemented in diagnostic process. ...
Full-text available
Recent progress in the development of artificial intelligence (AI) has sparked enthusiasm for its potential use in pathology. As pathology labs are currently starting to shift their focus towards AI implementation, a better understanding how AI tools can be optimally aligned with the medical and social context of pathology daily practice is urgently needed. Strikingly, studies often fail to mention the ways in which AI tools should be integrated in the decision-making processes of pathologists, nor do they address how this can be achieved in an ethically sound way. Moreover, the perspectives of pathologists and other professionals within pathology concerning the integration of AI within pathology remains an underreported topic. This article aims to fill this gap in the literature and presents the first in-depth interview study in which professionals’ perspectives on the possibilities, conditions and prerequisites of AI integration in pathology are explicated. The results of this study have led to the formulation of three concrete recommendations to support AI integration, namely: (1) foster a pragmatic attitude toward AI development, (2) provide task-sensitive information and training to health care professionals working in pathology departments and (3) take time to reflect upon users’ changing roles and responsibilities.
... Digital scribes have the potential to ease the strain of manual documentation. The redesign of the clinical encounter will be the cost of this assistance [32]. But India now has a space to enhance smart humanistic services and patient satisfaction as described in [34]. ...
... Despite commercial market approval of multiple AI products, there are very few examples of insurance reimbursement for AI. In order to establish added value for government and insurance agencies, larger clinical trials and real-life observational studies are required to demonstrate how the information is actually used by clinicians and how it impacts patient outcomes [65][66][67][68][69][70]. ...
Full-text available
Over the past decade, there has been a dramatic rise in the interest relating to the application of artificial intelligence (AI) in radiology. Originally only ‘narrow’ AI tasks were possible; however, with increasing availability of data, teamed with ease of access to powerful computer processing capabilities, we are becoming more able to generate complex and nuanced prediction models and elaborate solutions for healthcare. Nevertheless, these AI models are not without their failings, and sometimes the intended use for these solutions may not lead to predictable impacts for patients, society or those working within the healthcare profession. In this article, we provide an overview of the latest opinions regarding AI ethics, bias, limitations, challenges and considerations that we should all contemplate in this exciting and expanding field, with a special attention to how this applies to the unique aspects of a paediatric population. By embracing AI technology and fostering a multidisciplinary approach, it is hoped that we can harness the power AI brings whilst minimising harm and ensuring a beneficial impact on radiology practice.
... Another troublesome requirement is that in order to achieve high prediction accuracy, ergo maximum effectiveness, AI models require large amounts of curated and labeled patient data for training [17]. This ensures that, under all circumstances, they will be able to handle the complexity of comorbidities that are frequently seen in the population [19,20]. It is essential for the future welfare of healthcare systems, and medical professionals to evaluate the patients' condition not only from the results of the digital triage but also from the patients' clinical condition. ...
Full-text available
Purpose: In the Emergency Departments (ED) the current triage systems that are been implemented are based completely on medical education and the perception of each health professional who is in charge. On the other hand, cutting-edge technology, Artificial Intelligence (AI) can be incorporated into healthcare systems, supporting the healthcare professionals' decisions, and augmenting the performance of triage systems. The aim of the study is to investigate the efficiency of AI to support triage in ED. Patients–methods: The study included 332 patients from whom 23 different variables related to their condition were collected. From the processing of patient data for input variables, it emerged that the average age was 56.4 ± 21.1 years and 50.6% were male. The waiting time had an average of 59.7 ± 56.3 minutes while 3.9% ± 0.1% entered the Intensive Care Unit (ICU). In addition, qualitative variables related to the patient's history and admission clinics were used. As target variables were taken the days of stay in the hospital, which were on average 1.8 ± 5.9, and the Emergency Severity Index (ESI) for which the following distribution applies: ESI: 1, patients: 2; ESI: 2, patients: 18; ESI: 3, patients: 197; ESI: 4, patients: 73; ESI: 5, patients: 42. Results: To create an automatic patient screening classifier, a neural network was developed, which was trained based on the data, so that it could predict each patient's ESI based on input variables.The classifier achieved an overall accuracy (F1 score) of 72.2% even though there was an imbalance in the classes. Conclusions: The creation and implementation of an AI model for the automatic prediction of ESI, highlighted the possibility of systems capable of supporting healthcare professionals in the decision-making process. The accuracy of the classifier has not reached satisfactory levels of certainty, however, the performance of similar models can increase sharply with the collection of more data.
... • were not triage related (n = 2) [40,41]. ...
Full-text available
Introduction Patient-operated digital triage systems with AI components are becoming increasingly common. However, previous reviews have found a limited amount of research on such systems’ accuracy. This systematic review of the literature aimed to identify the main challenges in determining the accuracy of patient-operated digital AI-based triage systems. Methods A systematic review was designed and conducted in accordance with PRISMA guidelines in October 2021 using PubMed, Scopus and Web of Science. Articles were included if they assessed the accuracy of a patient-operated digital triage system that had an AI-component and could triage a general primary care population. Limitations and other pertinent data were extracted, synthesized and analysed. Risk of bias was not analysed as this review studied the included articles’ limitations (rather than results). Results were synthesized qualitatively using a thematic analysis. Results The search generated 76 articles and following exclusion 8 articles (6 primary articles and 2 reviews) were included in the analysis. Articles’ limitations were synthesized into three groups: epistemological, ontological and methodological limitations. Limitations varied with regards to intractability and the level to which they can be addressed through methodological choices. Certain methodological limitations related to testing triage systems using vignettes can be addressed through methodological adjustments, whereas epistemological and ontological limitations require that readers of such studies appraise the studies with limitations in mind. Discussion The reviewed literature highlights recurring limitations and challenges in studying the accuracy of patient-operated digital triage systems with AI components. Some of these challenges can be addressed through methodology whereas others are intrinsic to the area of inquiry and involve unavoidable trade-offs. 
Future studies should take these limitations in consideration in order to better address the current knowledge gaps in the literature.
... The price for this help will be a re-engineering of the clinical encounter. 52 Unconstrained clinical conversation between patient and doctor is non-linear, with the appearance of new information (e.g., a new clinical symptom or finding) triggering a re-exploration of a previously completed task such as an enquiry about family history of disease. 53 While a fully automated method to transform conversation into complete and accurate clinical records in such a dynamic setting is beyond the state of the art, it is possible to use AI methods to undertake subtasks in this process and still meaningfully reduce clinician documentation effort. ...
Full-text available
Healthcare has well-known challenges with safety, quality, and effectiveness, and many see artificial intelligence (AI) as essential to any solution. Emerging applications include the automated synthesis of best-practice research evidence including systematic reviews, which would ultimately see all clinical trial data published in a computational form for immediate synthesis. Digital scribes embed themselves in the process of care to detect, record, and summarize events and conversations for the electronic record. However, three persistent translational challenges must be addressed before AI is widely deployed. First, little effort is spent replicating AI trials, exposing patients to risks of methodological error and biases. Next, there is little reporting of patient harms from trials. Finally, AI built using machine learning may perform less effectively in different clinical settings.
... 31 Dehumanisation and biomedicalisation Dehumanisation was discussed in 19 publications. As Coiera 32 states, more biomedicalised healthcare system may have adverse impacts on patients with complex needs, who disproportionally are from socioeconomically disadvantaged groups and of a minority ethnicity. The only included empirical study on impacted populations was by Miller et al, 33 who surveyed users of a primary-care-triage AI-driven chatbot. ...
Full-text available
Objective Artificial intelligence (AI) will have a significant impact on healthcare over the coming decade. At the same time, health inequity remains one of the biggest challenges. Primary care is both a driver and a mitigator of health inequities and with AI gaining traction in primary care, there is a need for a holistic understanding of how AI affect health inequities, through the act of providing care and through potential system effects. This paper presents a systematic scoping review of the ways AI implementation in primary care may impact health inequity. Design Following a systematic scoping review approach, we searched for literature related to AI, health inequity, and implementation challenges of AI in primary care. In addition, articles from primary exploratory searches were added, and through reference screening. The results were thematically summarised and used to produce both a narrative and conceptual model for the mechanisms by which social determinants of health and AI in primary care could interact to either improve or worsen health inequities. Two public advisors were involved in the review process. Eligibility criteria Peer-reviewed publications and grey literature in English and Scandinavian languages. Information sources PubMed, SCOPUS and JSTOR. Results A total of 1529 publications were identified, of which 86 met the inclusion criteria. The findings were summarised under six different domains, covering both positive and negative effects: (1) access, (2) trust, (3) dehumanisation, (4) agency for self-care, (5) algorithmic bias and (6) external effects. The five first domains cover aspects of the interface between the patient and the primary care system, while the last domain covers care system-wide and societal effects of AI in primary care. A graphical model has been produced to illustrate this. 
Community involvement throughout the whole process of designing and implementing of AI in primary care was a common suggestion to mitigate the potential negative effects of AI. Conclusion AI has the potential to affect health inequities through a multitude of ways, both directly in the patient consultation and through transformative system effects. This review summarises these effects from a system tive and provides a base for future research into responsible implementation.
... 53 Moreover, although several studies have analysed the cost-effectiveness of AI over classic management in various pathologies, 54,55 to date no reports have analysed the cost of AI for a biological diagnostic test in the specific context of endometriosis. 56 The present study demonstrates the value of Endotest® from an economic perspective in the French context: when Endotest® was priced at €500 or €750, the strategies integrating the test were dominant over the current French diagnostic algorithm. This did not hold true when Endotest® was priced at €1000, although the ICER remained acceptable. ...
Full-text available
Objective: To evaluate a saliva diagnostic test (Endotest®) for endometriosis compared with the conventional algorithm. Design: A cost-effectiveness analysis with a decision-tree model based on literature data SETTING: France. Population: Women with chronic pelvic pain. Methods: Strategy I represents the French algorithm as the comparator. For the strategy II, all the patients have an Endotest®. For the strategy III, the patients undergo ultrasonography to detect endometrioma and those without have an Endotest®. For the strategy IV, patients with no endometrioma detected on ultrasonography, undergo a pelvic MRI to detect endometrioma and/or deep endometriosis. An Endotest® is then performed for patients with a negative MRI. Main outcomes measures: Costs and accuracy rates and ICERs. Three analyses were performed with an Endotest® priced at 500€, 750€, and 1,000€. Probabilistic sensitivity analysis was conducted with Monte Carlo simulations. Results: With an Endotest® priced at 750€, the cost per correctly diagnosed case was 1,542€, 990€, 919€, and 1,000€, respectively, for strategy I, II, III and IV. Strategy I was dominated by all other strategies. The strategies IV, III and II were respectively preferred for a willingness-to-pay threshold below 473€, between 473€ and 4,670€ and beyond 4,670€ per correctly diagnosed case. At a price of 500€ per Endotest®, strategy I was dominated by all other strategies. At 1,000€, the ICER of strategies II and III were 724€ and 387€ per correctly diagnosed case, respectively, compared with strategy I. Conclusion: The present study demonstrates the value of the Endotest® from an economic perspective.
... Machine learning in breast surgery may involve these sets of models and methods to detect patterns in vast amounts of patient data, extract appropriate information, and use it to perform decision-making under uncertain conditions 9 . The potential applications of machine learning are significant, and breast surgeons must strive to be informed with up-to-date knowledge and applications of this subset of AI within their speciality 10,11 . ...
Full-text available
Background: Machine learning is a set of models and methods that can automatically detect patterns in vast amounts of data, extract information, and use it to perform decision-making under uncertain conditions. The potential of machine learning is significant, and breast surgeons must strive to be informed with up-to-date knowledge and its applications. Methods: A systematic database search of Embase, MEDLINE, the Cochrane database, and Google Scholar, from inception to December 2021, was conducted of original articles that explored the use of machine learning and/or artificial intelligence in breast surgery in EMBASE, MEDLINE, Cochrane database and Google Scholar. Results: The search yielded 477 articles, of which 14 studies were included in this review, featuring 73 847 patients. Four main areas of machine learning application were identified: predictive modelling of surgical outcomes; breast imaging-based context; screening and triaging of patients with breast cancer; and as network utility for detection. There is evident value of machine learning in preoperative planning and in providing information for surgery both in a cancer and an aesthetic context. Machine learning outperformed traditional statistical modelling in all studies for predicting mortality, morbidity, and quality of life outcomes. Machine learning patterns and associations could support planning, anatomical visualization, and surgical navigation. Conclusion: Machine learning demonstrated promising applications for improving breast surgery outcomes and patient-centred care. Neveretheless, there remain important limitations and ethical concerns relating to implementing artificial intelligence into everyday surgical practices.
This paper describes the successful collaboration ‘in the wild’ between Clinical Documentation Integrity Specialists (CDIS) and an Artificial Intelligence (AI)-embedded software to conduct knowledge work. CDIS review patient charts in near real time to improve clinicians’ documentation, with the goal to make medical documentation more accurate, consistent and complete. CDIS collaborate with an AI-embedded “Computer Assisted Coding” (CAC) system that scans records from the Electronic Healthcare Record and auto-suggests codes based on natural language processing. CDIS find the CAC's suggestions are often inaccurate—often humorously so. Still, they find the CAC to be a useful helper, like Robin is to Batman. This human-AI collaboration is contingent on several factors: the flexible integration of the AI into the workflow similar to the notion of unremarkable AI; supporting the CDIS’ sensemaking; the CDIS’ knowledge about the CAC being predictably unreliable, an experience by the CDIS of the AI's value; humans remaining in control; and ability to experiment with the AI, which spurs reflection and learning for these knowledge workers.
Misdiagnosis by physicians is a common problem, affecting 5% of outpatients. There is growing interest in computerised diagnostic decision support systems for physicians, and increasingly for direct use by patients on mobile phones, termed Symptom Checkers (SC). These have the potential to improve the way in which health care is delivered and reduce the burden on GP services. However, claims have been made that the SC from Babylon Health is more accurate at diagnosis than physicians. Evaluations to date have primarily been conducted in controlled environments using clinician-generated scenarios, and surrogate outcomes such as diagnostic performance in lieu of clinical outcomes. Such results are unlikely to reflect real-world use and can be unrealistically optimistic. Patient use risks missing important diagnoses and/or may increase the burden on the health system. To avoid this, we advocate the use of multi-stage evaluation, building on many years of experience in health informatics and reflecting best practice in other areas of medicine.
Background: There is interest in using convolutional neural networks (CNNs) to analyze medical imaging to provide computer-aided diagnosis (CAD). Recent work has suggested that image classification CNNs may not generalize to new data as well as previously believed. We assessed how well CNNs generalized across three hospital systems for a simulated pneumonia screening task. Methods and findings: A cross-sectional design with multiple model training cohorts was used to evaluate model generalizability to external sites using split-sample validation. A total of 158,323 chest radiographs were drawn from three institutions: National Institutes of Health Clinical Center (NIH; 112,120 from 30,805 patients), Mount Sinai Hospital (MSH; 42,396 from 12,904 patients), and Indiana University Network for Patient Care (IU; 3,807 from 3,683 patients). These patient populations had an age mean (SD) of 46.9 years (16.6), 63.2 years (16.5), and 49.6 years (17) with a female percentage of 43.5%, 44.8%, and 57.3%, respectively. We assessed individual models using the area under the receiver operating characteristic curve (AUC) for radiographic findings consistent with pneumonia and compared performance on different test sets with DeLong's test. The prevalence of pneumonia was high enough at MSH (34.2%) relative to NIH and IU (1.2% and 1.0%) that merely sorting by hospital system achieved an AUC of 0.861 (95% CI 0.855-0.866) on the joint MSH-NIH dataset. Models trained on data from either NIH or MSH had equivalent performance on IU (P values 0.580 and 0.273, respectively) and inferior performance on data from each other relative to an internal test set (i.e., new data from within the hospital system used for training data; P values both
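The shortcut effect this abstract describes — a large prevalence gap between hospitals letting a model score well simply by detecting which site an image came from — can be illustrated with a small simulation. This is a minimal sketch: the two prevalence figures (34% vs. 1%) come from the abstract, while the sample size, site split, and the "classifier" that only recognizes the site are hypothetical.

```python
import random

def auc(labels, scores):
    # AUC equals the probability that a randomly chosen positive case
    # receives a higher score than a randomly chosen negative case
    # (ties count as half a win).
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

random.seed(0)
# Hypothetical joint dataset: half the images from site A (34% pneumonia
# prevalence), half from site B (1% prevalence), as in the abstract.
labels, scores = [], []
for _ in range(2000):
    site_a = random.random() < 0.5
    label = 1 if random.random() < (0.34 if site_a else 0.01) else 0
    labels.append(label)
    scores.append(1.0 if site_a else 0.0)  # "model" that only detects the site

# The site-only score achieves an AUC well above chance (0.5) despite
# carrying no information about the disease itself.
print(round(auc(labels, scores), 2))
```

Because most positives come from the high-prevalence site, scoring by site alone separates positives from negatives far better than chance, which is why cross-site validation (as in this study) is needed to expose such confounds.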
Current generation electronic health records suffer a number of problems that make them inefficient and associated with poor clinical satisfaction. Digital scribes, or intelligent documentation support systems, take advantage of advances in speech recognition, natural language processing, and artificial intelligence to automate the clinical documentation task currently conducted by humans. Whilst in their infancy, digital scribes are likely to evolve through three broad stages. Human-led systems task clinicians with creating documentation, but provide tools to make the task simpler and more effective, for example with dictation support, semantic checking, and templates. Mixed-initiative systems are delegated part of the documentation task, converting the conversations in a clinical encounter into summaries suitable for the electronic record. Computer-led systems are delegated full control of documentation and only request human interaction when exceptions are encountered. Intelligent clinical environments permit such augmented clinical encounters to occur in a fully digitised space where the environment becomes the computer. Data from clinical instruments can be automatically transmitted, interpreted using AI, and entered directly into the record. Digital scribes raise many issues for clinical practice, including new patient safety risks. Automation bias may see clinicians automatically accept scribe documents without checking. The electronic record also shifts from a human-created summary of events to potentially a full audio, video, and sensor record of the clinical encounter. Digital scribes promisingly offer a gateway into the clinical workflow for more advanced support for diagnostic, prognostic, and therapeutic tasks.
Big data, we have all heard, promise to transform health care. But in the “hype cycle” of emerging technologies, machine learning now rides atop the “peak of inflated expectations,” and we need to better appreciate the technology’s capabilities and limitations.
Deep learning algorithms, in particular convolutional networks, have rapidly become a methodology of choice for analyzing medical images. This paper reviews the major deep learning concepts pertinent to medical image analysis and summarizes over 300 contributions to the field, most of which appeared in the last year. We survey the use of deep learning for image classification, object detection, segmentation, registration, and other tasks and provide concise overviews of studies per application area. Open challenges and directions for future research are discussed.
Introduction: While potentially reducing decision errors, decision support systems can introduce new types of errors. Automation bias (AB) happens when users become overreliant on decision support, which reduces vigilance in information seeking and processing. Most research originates from the human factors literature, where the prevailing view is that AB occurs only in multitasking environments. Objectives: This review seeks to compare the human factors and health care literature, focusing on the apparent association of AB with multitasking and task complexity. Data sources: EMBASE, Medline, Compendex, Inspec, IEEE Xplore, Scopus, Web of Science, PsycINFO, and Business Source Premier from 1983 to 2015. Study selection: Evaluation studies where task execution was assisted by automation and resulted in errors were included. Participants needed to be able to verify automation correctness and perform the task manually. Methods: Tasks were identified and grouped. Task and automation type and presence of multitasking were noted. Each task was rated for its verification complexity. Results: Of 890 papers identified, 40 met the inclusion criteria; 6 were in health care. Contrary to the prevailing human factors view, AB was found in single tasks, typically involving diagnosis rather than monitoring, and with high verification complexity. Limitations: The literature is fragmented, with large discrepancies in how AB is reported. Few studies reported the statistical significance of AB compared to a control condition. Conclusion: AB appears to be associated with the degree of cognitive load experienced in decision tasks, and appears to not be uniquely associated with multitasking. Strategies to minimize AB might focus on cognitive load reduction.
This Viewpoint discusses the opportunities and ethical implications of using machine learning technologies, which can rapidly collect and learn from large amounts of personal data, to provide individualized patient care. Must a physician be human? A new computer, “Ellie,” developed at the Institute for Creative Technologies, asks questions as a clinician might, such as “How easy is it for you to get a good night’s sleep?” Ellie then analyzes the patient’s verbal responses, facial expressions, and vocal intonations, possibly detecting signs of posttraumatic stress disorder, depression, or other medical conditions. In a randomized study, 239 probands were told that Ellie was “controlled by a human” or “a computer program.” Those believing the latter revealed more personal material to Ellie, based on blind ratings and self-reports.1 In China, millions of people turn to Microsoft’s chatbot, “Xiaoice,”2 when they need a “sympathetic ear,” despite knowing that Xiaoice is not human. Xiaoice develops a specially attuned personality and sense of humor by methodically mining the Internet for real text conversations. Xiaoice also learns about users from their reactions over time and becomes sensitive to their emotions, modifying responses accordingly, all without human instruction. Ellie and Xiaoice are the result of machine learning technology.