Ultrasound Obstet Gynecol 2020; 56: 498–505
Published online in Wiley Online Library (wileyonlinelibrary.com). DOI: 10.1002/uog.22122.
This is an open access article under the terms of the Creative Commons Attribution License, which permits use, distribution and
reproduction in any medium, provided the original work is properly cited.
State-of-the-Art Review
Introduction to artificial intelligence
in ultrasound imaging in obstetrics
and gynecology
L. DRUKKER1, J. A. NOBLE2 and A. T. PAPAGEORGHIOU1*
1Nuffield Department of Women’s & Reproductive Health,
University of Oxford, John Radcliffe Hospital, Oxford, UK;
2Institute of Biomedical Engineering, University of Oxford,
Oxford, UK
*Correspondence (e-mail: aris.papageorghiou@wrh.ox.ac.uk)
ABSTRACT
Artificial intelligence (AI) uses data and algorithms to
aim to draw conclusions that are as good as, or even
better than, those drawn by humans. AI is already
part of our daily life; it is behind face recognition
technology, speech recognition in virtual assistants (such
as Amazon Alexa, Apple’s Siri, Google Assistant and
Microsoft Cortana) and self-driving cars. AI software has
been able to beat world champions in chess, Go and
recently even Poker. Relevant to our community, it is
a prominent source of innovation in healthcare, already
helping to develop new drugs, support clinical decisions
and provide quality assurance in radiology. The list of
medical image-analysis AI applications with USA Food
and Drug Administration or European Union (soon to
fall under European Union Medical Device Regulation)
approval is growing rapidly and covers diverse clinical
needs, such as detection of arrhythmia using a smartwatch
or automatic triage of critical imaging studies to the top
of the radiologist’s worklist. Deep learning, a leading
tool of AI, performs particularly well in image pattern
recognition and, therefore, can be of great benefit to
doctors who rely heavily on images, such as sonologists,
radiographers and pathologists. Although obstetric and
gynecological ultrasound are two of the most commonly
performed imaging studies, AI has had little impact on
this field so far. Nevertheless, there is huge potential
for AI to assist in repetitive ultrasound tasks, such as
automatically identifying good-quality acquisitions and
providing instant quality assurance. For this potential
to thrive, interdisciplinary communication between AI
developers and ultrasound professionals is necessary. In
this article, we explore the fundamentals of medical
imaging AI, from theory to applicability, and introduce
some key terms to medical professionals in the field of
ultrasound. We believe that wider knowledge of AI will
help accelerate its integration into healthcare. ©2020
The Authors. Ultrasound in Obstetrics & Gynecology
published by John Wiley & Sons Ltd on behalf of the
International Society of Ultrasound in Obstetrics and
Gynecology.
Introduction
Artificial intelligence (AI) is described as the ability of a
computer program to perform processes associated with
human intelligence, such as reasoning, learning, adapta-
tion, sensory understanding and interaction1. In his
seminal paper published in 19502, Alan Turing introduced a
test (now called ‘the Turing test’) in which, if an evaluator
cannot distinguish whether intelligent behavior is exhib-
ited by a machine or a human, the machine is said to have
passed the test2. John McCarthy coined the term ‘artificial
intelligence’ soon after3. The Journal of Artificial Intelli-
gence commenced publication in 1970, but it took several
years for computing power to match theoretical possibil-
ities and allow development of modern algorithms.
In simple terms, traditional computational algorithms
are software programs that follow a sequence of rules
and perform an identical function every time, such as
an electronic calculator: ‘if this is the input, then that
is the output’. In contrast, an AI algorithm learns the
rules (function) from training data (input) presented to
it. Major milestones in the history of AI include the Deep
Blue computer outmatching the world chess champion,
Garry Kasparov, in 1997 and AlphaGo defeating one of
the best players (ranked 9-dan) of the ancient Chinese
game of Go, Lee Sedol, in 20164.
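Before returning to these game-playing milestones, the contrast drawn above between a fixed rule and a rule learned from training data can be made concrete with a minimal sketch (ours, purely illustrative; it assumes Python with NumPy and scikit-learn installed):

```python
# A hand-written rule versus a rule learned from example input/output pairs.
import numpy as np
from sklearn.linear_model import LinearRegression

def fixed_rule(x):
    return 2 * x + 1                        # traditional software: the rule is written by hand

X = np.array([[0.0], [1.0], [2.0], [3.0]])  # training inputs
y = np.array([1.0, 3.1, 4.9, 7.2])          # observed (slightly noisy) outputs
learned = LinearRegression().fit(X, y)      # the "rule" is inferred from the examples

print(fixed_rule(4.0))                      # always 9, by construction
print(learned.predict([[4.0]])[0])          # approximately 9, learned from the data
```

The hand-written function always applies the same rule, whereas the fitted model infers an approximate rule from the examples it was shown; this is, in miniature, what learning from training data means.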
Both chess and Go are games that require strategy,
foresight and logic, all of which are qualities typically
attributed to human intelligence. Go is considered much
more difficult for computers than chess, because it involves
far more possible moves (approximately 8 million choices
for three moves as opposed to 40 000 for chess). The
victory in Go represents the progress in computational
algorithms, improved computing infrastructure and access
to enormous amounts of data. The same evolution has led
to several widely popularized AI consumer applications,
including autocomplete on Google search, virtual
assistants (such as Alexa, Cortana, Google Home and
Siri), personalized shopping recommendations, the emer-
gence of automatic self-driving cars and face recognition
(for instance, searching by a face in Google Photos).
In clinical medicine, the interest (and recent hype)
in AI technologies stems from their potential to
transform healthcare by deriving new and impor-
tant insights from the vast amount of digital data
generated during delivery of healthcare. Promising
medical AI applications are emerging in the areas of
screening5,6, prediction7–9, triage10,11, diagnosis12,13,
drug development14,15, treatment16,17, monitoring18
and imaging interpretation19,20. Several original studies
published in this Journal have used AI methodology to
evaluate adnexal masses21, the risk of lymph node metastases
in endometrial cancer22, pelvic organ function23,24
and breast lesions25–27, assess aneuploidy risk28, predict
fetal lung maturity29, perinatal outcome30, shoulder
dystocia31 and brain damage32, estimate gestational age
in late pregnancy33 and classify standard fetal brain
images as normal or abnormal34 (Table 1). The number
of AI-related papers is increasing; at the 29th World
Congress of the International Society of Ultrasound in
Obstetrics and Gynecology (ISUOG) in 2019, there were
14 abstracts specifically mentioning AI, in comparison to
a total of 13 abstracts in the preceding six ISUOG World
Congresses (2013–2018).
As with any scientific discipline, the AI scientific
community uses technical language and terminology that
can be difficult to understand for those outside the area.
This, in addition to the rapid advancement of the field, can
make it challenging for other disciplines to keep abreast
of developments in AI. Indeed, one of the key concerns
that has been expressed regarding AI in medicine is that
there are relatively few interdisciplinary professionals
who work at the interface of AI and medicine and can
‘translate’ between the two35. A recent review of 250
AI papers emphasized the need for greater collaboration
between computational scientists and medical profession-
als to generate more scientifically sound and impactful
work integrating knowledge from both domains36.
To contribute to this discussion, this article aims to
explain key AI-related concepts and terms to clinicians in
the field of ultrasound in obstetrics and gynecology. For
simplicity, we use the general term ‘artificial intelligence
(AI)’, which is commonly used by others in the field,
although most articles referring to AI in clinical medicine
are based on deep learning, a subset of AI (Box 1,
Figure 1). It is also important to appreciate that relatively
few AI-based ultrasound applications have advanced the
whole way from academic concept to clinical application
and commercialization. Therefore, we also use examples
from radiology, being our closest sister field.
Artificial intelligence and medical imaging
The current interest in AI in medical imaging stems from
major advances in deep learning-based ‘computer vision’
over the past decade. The field of computer vision concerns
computers that interpret and understand the visual world.
Within computer vision, object recognition (‘what can I
see in this image?’) is a key task which can be posed
as an image classification problem. Researchers in this
field use ‘challenge’ datasets to benchmark the progress
in accuracy of image classification. One such challenge
dataset, called the ImageNet project, is a database of more
than 14 million images of everyday (non-medical) objects
that have been labeled by humans into more than 20 000
categories. This large database was first made available
to the scientific community in 2010 to train algorithms
for image classification. In 2015, the ImageNet annual
competition reached a milestone when the error rate of
automatic classification of images dropped below 5%,
which is the average human error rate (Figure S1)17. This
was largely due to advances in deep learning, the branch
of AI that learns from large amounts of data.
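To give a sense of how an ImageNet-trained classifier is used in practice, the following minimal sketch (ours, not from any cited work; it assumes the PyTorch and torchvision libraries, torchvision version 0.13 or later, and a hypothetical image file named 'cat.jpg') classifies a single everyday image with a pre-trained network:

```python
# Classify one everyday image with a network pre-trained on ImageNet.
import torch
from PIL import Image
from torchvision import models

weights = models.ResNet18_Weights.IMAGENET1K_V1
model = models.resnet18(weights=weights).eval()        # pre-trained 1000-class classifier
preprocess = weights.transforms()                      # matching resize/crop/normalize steps

img = preprocess(Image.open("cat.jpg")).unsqueeze(0)   # 'cat.jpg' is a hypothetical example file
with torch.no_grad():
    probs = model(img).softmax(dim=1)
top = probs.argmax(dim=1).item()
print("Predicted class:", weights.meta["categories"][top])
```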
Deep learning excels in pattern recognition and we
believe that medical professions which rely on imaging will
be the first to see the benefits of this tool (Appendix S1).
One of the largest driving forces behind AI in medical
imaging is the enormous amount of digital data gener-
ated around the world that may be useful in training
algorithms. As of May 2020, there are more than 50
deep learning-based imaging applications37 approved by
the USA Food and Drug Administration (FDA) or the
European Union, spanning across most imaging modal-
ities, including X-ray, computerized tomography (CT),
magnetic resonance imaging, retinal optical coherence
tomography and ultrasound. Approved AI applications
are designed to provide increased productivity by per-
forming automated screening, assisting in diagnosis or
prioritizing a radiology study that needs to be ‘at the top
of the list’. Applications include identification of cere-
brovascular accidents, diabetic retinopathy, skeletal frac-
tures, cancer, pulmonary embolism and pneumothorax37.
Recently, the first ultrasound AI application that guides
the user received FDA approval; the software uses AI
Table 1 Examples of reported and expected future artificial intelligence (AI) applications in obstetric and gynecological ultrasound

AI application | Description | Clinical utility
Probe guidance | Operator is guided how to manipulate probe to acquire fetal biometric plane | Facilitate sonographer training; basic scanning can be performed by non-expert (e.g. general practitioner)
Fetal biometric plane finder | Standard fetal biometric planes are automatically acquired, measured and stored | Reduce repetitive caliper adjustment clicks; reduce operator bias; instant quality control
Anomaly scan completeness | Anomaly scan checklist of mandatory planes is populated automatically | Ensure completeness of imaging and that all parts of anatomy are checked
Anomaly highlighting | Unusual fetal findings are identified in a standard plane | Highlight suspected abnormal finding; assist sonographer with referral decision
Cyst classification | Ovarian cysts are classified according to IOTA criteria | Improve consistency; reduce likelihood of error
Lung scans for Ob/Gyn | Ob/Gyn experts are taught how to perform lung ultrasound in patients with COVID-19 | Reduce learning curve

COVID-19, coronavirus disease 2019; IOTA, International Ovarian Tumor Analysis.
Box 1 Glossary of commonly used artificial intelligence terms
Artificial intelligence (AI) refers to a machine or software performing tasks that would ordinarily require human
brainpower to accomplish, such as making sense of spoken language, learning behaviors or solving problems
(Figure 1). This means that an AI program can learn from real-world data as well as experience, and encompasses
the capacity to improve its performance given more data. Nevertheless, there is no accepted definition of AI, and
therefore, the term is often misused71. AI can be broken down into general AI, which is human-like intelligence (i.e.
ability to think, learn, reason) and narrow AI, which is the ability to perform a specific task (i.e. image detection,
translation, chess-playing).
Convolutional neural networks (CNNs), a widely used type of artificial neural network, are computational algorithms
inspired by the biological neural networks that constitute animal brains and consist of multilayered artificial neurons
(Figure 1). A CNN can be pictured as a system of hidden connections between input and output. CNNs have the ability
to determine the relationship between input (such as brain computerized tomography (CT)) and labels (presence or
absence of hemorrhage). This is in contrast to traditional software, in which predetermined logic rules set the output
to specific stimuli. In reality, there is little resemblance to human neurons.
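As an illustration of what 'multilayered artificial neurons' look like in code, the following minimal sketch (ours, not taken from any cited work; it assumes the PyTorch library, and the layer sizes are purely illustrative) defines a small CNN that maps a single-channel 256 x 256 image to one output score:

```python
# A tiny CNN: input image -> hidden convolutional layers -> single output score.
import torch
import torch.nn as nn

class TinyCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.hidden = nn.Sequential(                        # the "hidden layers"
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.output = nn.Linear(32 * 64 * 64, 1)            # one score, e.g. finding present/absent

    def forward(self, x):
        x = self.hidden(x)                                  # input -> hidden connections
        return self.output(x.flatten(start_dim=1))          # hidden connections -> output

model = TinyCNN()
fake_scan = torch.randn(1, 1, 256, 256)                     # a random stand-in for one image
print(model(fake_scan).shape)                               # torch.Size([1, 1])
```

The weights of the hidden connections are not written by hand; they are adjusted during training so that the output matches the labels, as described under supervised learning below.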
Black box is the term often used to describe the process occurring inside the hidden layers of CNNs. For example,
a new AI product is launched aimed at detecting intracranial hemorrhage. When this software reads a CT scan that
shows signs of intracranial hemorrhage, it will correctly report evidence of intracranial hemorrhage to
the care team, yet it may not explain why it reached this conclusion. There is an ongoing effort aimed at providing
‘explainability’ to AI, to report the ‘how’ in addition to the result (Explainable AI).
Explainable AI is an emerging subfield of AI that attempts to explain how black box decisions of AI systems are
made. Explainable AI aims to understand the key steps involved in making computational decisions. This should
theoretically allow decisions taken by an algorithm to be understood by end-users.
Model, application or algorithm are all terms used interchangeably for the ready-to-use AI software/product.
Machine learning is a branch of AI, defined by the ability to learn from data without being explicitly programmed
(Figure 1). Machine learning can be understood as a statistical method that gradually improves as it is exposed to
more data, by extracting patterns from data.
Deep learning is a branch of machine learning (Figure 1). In deep learning, the input and output are connected by
multiple layers of hidden connections, also known as CNNs. Deep learning involves learning from vast amounts
of data and performs especially well in pattern recognition within data; therefore, it can be particularly helpful in
medical imaging. Deep learning is usually divided into two major classes:
1) Supervised learning, in which labeled (annotated) data are used as input to a CNN (Appendix S1). For example,
to build an application detecting brain hemorrhage on a CT scan, the CNN is first trained using labeled data, i.e.
normal scans and scans with hemorrhage, each labeled with the correct diagnosis by a radiologist (label = hemorrhage
present/absent). Following training using the training dataset, evaluation of the CNN is carried out using a test
dataset; these are new CT scans (not contained in the training dataset), with and without hemorrhage, whose labels
are withheld from the model. The CNN outputs a prediction for each test scan and the predictions are compared with
the withheld labels. After validation of the prediction accuracy, the model is ready to use. For instance, the final
model is software that can read a brain CT scan (input = CT scan) and decide whether intracranial hemorrhage is
present or absent (output = hemorrhage present/absent); a minimal code sketch of this training process is given
after the unsupervised-learning entry below.
2) Unsupervised learning is a training process that does not require labeling. This saves the time-consuming,
labor-intensive and expensive human labeling process. In the intracranial hemorrhage example, the learning input
would be CT scans of patients with and without hemorrhage that are not labeled (i.e. the machine is never told if
bleeding is absent or present). The CNN will learn by clustering scans that look similar to one another (learn from
similarities and differences), which should result in classifying images to either hemorrhage or no hemorrhage.
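For readers who want to see what these two training routes look like in code, two minimal sketches follow (ours, for illustration only; they assume the PyTorch, NumPy and scikit-learn libraries, and random numbers stand in for real, curated scans). The first trains a small supervised classifier on labeled examples:

```python
# Supervised learning: fit a small CNN to images paired with radiologist labels.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

images = torch.randn(64, 1, 64, 64)                    # 64 fake single-channel "scans"
labels = torch.randint(0, 2, (64, 1)).float()          # 1 = finding present, 0 = absent
loader = DataLoader(TensorDataset(images, labels), batch_size=8, shuffle=True)

model = nn.Sequential(                                 # a very small CNN classifier
    nn.Conv2d(1, 8, kernel_size=3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(8, 1),
)
loss_fn = nn.BCEWithLogitsLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

for epoch in range(3):                                 # a few passes over the training set
    for x, y in loader:
        optimizer.zero_grad()
        loss = loss_fn(model(x), y)                    # compare prediction with the label
        loss.backward()                                # adjust the hidden-layer weights
        optimizer.step()
# Held-out test scans (never seen during training) are then scored and the predictions
# compared with their withheld labels to estimate accuracy before use.
```

The second sketch takes the unsupervised route, grouping unlabeled examples into two clusters without ever being told which is which:

```python
# Unsupervised learning: cluster unlabeled feature vectors into two groups.
import numpy as np
from sklearn.cluster import KMeans

features = np.random.rand(200, 128)                    # 200 unlabeled scans, 128 features each
clusters = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(features)
print(clusters[:10])                                   # cluster assignment (0 or 1) per scan
# A human then reviews a few scans from each cluster to decide which cluster
# corresponds to the finding being present and which to its absence.
```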
Big data: In order to achieve good performance, supervised AI applications require a large volume of labeled training
data (usually images) from which to learn. Establishing a clinically relevant, well-curated dataset that can be used to
train an algorithm can be a very time-intensive process, and the accuracy of such curation determines the quality of
the derived model.
to help the user capture images of acceptable diagnostic
quality during adult echocardiography38. The market of
AI applications in medical imaging alone is forecasted to
top $2 billion by 202339.
What about ultrasound? Ultrasound AI software needs
to fit into the workflow differently from, for example,
in the analysis of a CT scan; in ultrasound, real-time
analysis at the point of acquisition is ideally needed,
Convolutional (articial) neural network
(a)
(b)
Deep network
(hidden layers)
Artical intelligence
Algorithms that gradually
improve with experience;
Learning without being
explicitly programmed.
Examples: Amazon shopping
recommendations, email spam
lter, Google search algorithm
Learning from vast
amounts of data
Computer mimic human
behavior: sense, reason,
act and adapt
Example:
Deep Blue chess program
Examples: AlphaGo,
self-driving cars,
ARDA
Machine learning
Deep learning
Output Input Brain neurons OutputInput
Human neural network
Figure 1 Graphic representation of artificial intelligence. (a) Human neural network architecture and its resemblance to a deep artificial
neural network. (b) Relationship between artificial intelligence, machine learning and deep learning. ARDA, automated retinal disease
assessment (Appendix S1).
while in CT, automated reading is only needed at the
end of the examination. Compared to the image acqui-
sition and analysis abilities of a sonologist, no known
current AI method is generic enough to be applied to a
wide range of tasks (e.g. an AI application designed for
the second trimester is unlikely to be applicable to the
first-trimester scan). For each ultrasound task, there are
several image acquisition and analysis capabilities that
can be met by an AI application, including classification
(‘what objects are present in this image?’), segmentation
(‘where are the organ boundaries?’), navigation (‘how
can I acquire the optimal image?’), quality assessment (‘is
this image fit for purpose to make a diagnosis?’) and
diagnosis (‘what is wrong with the imaged object?’).
Active academic research and emerging examples
of AI-assisted applications for ultrasound include
plane-finding (navigation) and automated quantification
for analysis of the breast, prostate, liver and heart40–42.
In obstetric and gynecological ultrasound, promis-
ing workload-changing advancements include automatic
detection of standard planes and quality assurance in
fetal ultrasound43– 45, detection of endometrial thickness
in gynecology46 and automatic classification of ovarian
cysts (Table 1).
Challenges
The introduction of AI into clinical practice offers many
potential benefits, but there are also many challenges and
uncertainties that may raise concerns.
The impact of AI on jobs is among the most widely
discussed concerns47–49. Major technological advances
frequently impact the job market, and the current wave of
AI-based automation is no exception. However, this does
not automatically imply technological unemployment;
rather, it may trigger a transformation in the way we
work, resulting in professional realignment. AI can
enhance both the value and the professional satisfaction
of sonographers and maternal-fetal medicine experts by
reducing the time needed for routine tasks and allowing
more time to perform tasks that add value and influence
patient care49,50. An important advantage that machines
have over humans is reproducibility: machines retain
absolute consistency over time whereas the performance
of a clinician varies depending on many factors, such as
years of experience, fatigue or simple distractions, such
as a late-running clinic or a ringing phone. Additionally,
an AI application has higher capacity, theoretically
being able to read thousands of scans, while a radio-
grapher reads 50–100 scans per day49. Evidence in the
literature suggests that the first wave of AI applications
is likely to constitute assistive technology, taking over
repetitive tasks to improve consistency, such as reading
radiographs51. Specifically in ultrasound, automation will
assist in shortening the total scan duration by removing
the need for some of the tiresome or ‘simple’ repetitive
tasks, such as acquiring standard planes, annotation or
adjustment of calipers (Table 1). This may allow more
time to analyze additional scan planes or to communicate
the results to patients52. Automation should also be seen
in the context of a global shortage of imaging experts,
including sonographers and radiographers, while demand
for diagnostic imaging is rising53.
Applicability is another concern relating to the imple-
mentation of AI in clinical medicine. Imaging features
alone are often not sufficient to determine diagnosis and
management. Consider, for instance, an AI application
developed to report on ovarian cysts that is designed to
produce a binary outcome of malignant features being
absent or present based on an ovarian imaging training
dataset. Clinicians also take into account the clinical
context, including many factors such as age, menopausal
status and familial risk factors, when making a diagnosis.
While it could be argued that the clinician may be
biased by clinical information, this example highlights
the importance of understanding when an AI solution is
applicable and when it is not. AI models can account only
for information ‘seen’ during training, so in this example,
non-imaging clinical information is not taken into
account by the AI model. Hence, an important emerging
area of healthcare AI research focuses on building AI
models that integrate imaging and electronic health
record data for ‘personalized diagnostic imaging’54,55.
Another fear, which is largely unwarranted, relates to
adaptable systems, which are AI applications that con-
tinue to learn, adapt and optimize based on new data
and hence may jeopardize the application’s safety. Reg-
ulatory bodies, including the FDA, currently approve
only AI applications with models that have ‘locked’
parameters56,57. This means that all current AI applica-
tions are static models that can no longer adapt, and there-
fore, the approved product does not change over time.
The ‘black box’ design of AI applications is attractive
at one level, as there is no need to understand how
the complex non-linear optimization works, but is also
a source of concern as clinicians want to understand
any associated bias and likely modes of failure. Most
AI models are derived by using ‘supervised learning’,
meaning that the model learns from data annotated by
humans (Box 1). Since human involvement can potentially
introduce bias to the learning process, the resulting model
could also be biased. Understanding model bias is an
important aspect of AI model design and an active area
of research58. For example, as operators seem to be at
risk of expected-value bias when acquiring fetal biometry
measurements, an algorithm trained to measure standard
biometric planes by supervised learning might end up
having a built-in bias when automatically calculating
fetal biometry59. To better understand AI model bias, as
well as to provide insights into how AI algorithms make
decisions, ‘explainable AI’ is an emerging subfield of AI
research aiming to demystify the black box design.
Deep learning excels in pattern recognition, but it is
important to recognize that most methods are supervised
(training data are manually annotated). Manual anno-
tation is resource-intensive and is often subjective. Most
academic publications use data annotated a single time
by one or more human annotators, which means that
the derived model will be biased by (or skewed towards)
the human annotator’s preferred method of annotation.
If, instead, each image is annotated by multiple humans,
then there need to be rules about how to agree on
consensus if their annotations differ. There is no one
way to do this. Thus, one can appreciate that the process
of annotation and subsequent data cleaning is both
resource-intensive and determines the success of model
performance. Furthermore, traditional deep-learning
methods require a considerable volume of data to build
accurate models, which are not always available. There
are some ways to address this limitation which are the
subject of current deep learning in medical imaging
research. These include using pre-trained models, which
essentially allow initialization of the parameters of a new
model with those of a model built for another problem,
and allowing new data to update model parameters.
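As a concrete illustration of these two ideas, the following minimal sketch (ours, for illustration only; it assumes torchvision version 0.13 or later, and random tensors stand in for real images) initializes a network with parameters from a model built for another problem (ImageNet classification), replaces its output layer for a new two-class task, and then lets new data update the parameters:

```python
# Transfer learning: start from pre-trained parameters, then fine-tune on new data.
import torch
import torch.nn as nn
from torchvision import models

model = models.mobilenet_v2(weights=models.MobileNet_V2_Weights.IMAGENET1K_V1)
model.classifier[-1] = nn.Linear(model.last_channel, 2)    # new head, e.g. normal/abnormal

for p in model.features.parameters():                      # optionally freeze the pre-trained
    p.requires_grad = False                                 # backbone and train only the new head

optimizer = torch.optim.Adam(model.classifier.parameters(), lr=1e-4)
x = torch.randn(4, 3, 224, 224)                             # stand-in batch of new images
y = torch.tensor([0, 1, 0, 1])                              # their (hypothetical) labels
loss = nn.CrossEntropyLoss()(model(x), y)
loss.backward()
optimizer.step()                                            # the new data update the parameters
```

Starting from pre-trained parameters in this way usually needs far fewer labeled medical images than training the same network from scratch.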
Another issue deep-learning scientists have to consider
is deployability, as traditional deep-learning models can
have millions of parameters and take up lots of computer
memory. Models can be reduced in size empirically, and
there is an emerging area of interest in designing small
deep neural networks, such as MobileNet and SESNet, as
the backbone for deployable AI application models.
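The difference in footprint between a conventional backbone and one of these compact designs can be checked directly; the following minimal sketch (ours; it assumes torchvision) simply counts parameters:

```python
# Compare the parameter counts of a large and a compact ("deployable") backbone.
from torchvision import models

def n_params(model):
    return sum(p.numel() for p in model.parameters())

print("ResNet-50:   %.1f million parameters" % (n_params(models.resnet50()) / 1e6))
print("MobileNetV2: %.1f million parameters" % (n_params(models.mobilenet_v2()) / 1e6))
```

MobileNetV2 has roughly 3.5 million parameters against roughly 25 million for ResNet-50, which is why such compact backbones are attractive when a model must run on the scanner itself or on a mobile device.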
Unfortunately, there are high expectations of AI appli-
cations which have yet to be backed up by wide-scale
convincing multicenter clinical studies and, when
appropriate, randomized clinical trials. An interesting
overview of the current standards of AI research in medi-
cal imaging is provided in a recent publication60. Indeed,
most of the reported AI applications to date use data from
a single site and focus on algorithm performance rather
than looking at clinical utility or health economics60. It
is particularly challenging to assess an AI model when
the accuracy of a human expert for the same task is
difficult to determine or is unknown60. It is important to
appreciate that healthcare AI is an emerging technology
and, as such, it will take time to determine the best ways
to validate and regulate AI applications. Towards this
goal, a recent multinational academic report addressing
both medical and non-medical AI systems, entitled
‘Toward Trustworthy AI Development: Mechanisms for
Supporting Verifiable Claims’61, provides a list of mea-
sures and mechanisms for AI developers and regulatory
bodies to ensure responsible AI development. Among
the recommendations, the report calls for introduction
of third-party auditing of AI systems, creating a system
for reporting AI incidents and encouraging researchers in
academia to verify claims made by industry.
No discussion about AI would be complete without
mentioning ethics62. Recently, the classic theoretical
‘trolley problem’ experiment was applied to self-driving
cars, as part of an online experimental platform designed
to explore human perspective on moral decisions made
by autonomous vehicles63. The question is whose safety
should be prioritized in the event of an accident.
Essentially, the problem asks: if your car brakes suddenly
fail as you speed toward a crowded crosswalk, and you are
confronted with the dilemma to veer right and crash into
an old man or veer left and crash into a concrete wall and
kill the car driver and passengers, what would you choose?
Now, what if instead of an old man, it was a woman
pushing a stroller or a homeless person crossing the road?
Human drivers who are badly injured or die in a car crash
cannot report whether they were faced with a dilemma.
However, self-driving cars can be programmed to act in a
certain way. Similarly, the use of AI-based solutions may
raise several moral questions in medicine64,65: would we
trust computers to screen for disease, prioritize treatment,
diagnose, treat, discharge? Would we let a fully automated
AI-based solution choose the patient to occupy the only
available intensive care unit bed?
Ethical concerns also surround the issue of privacy65,66.
Developing AI applications typically requires a large
volume of data about patients and their diagnoses.
Such personal data are usually collected by health
authorities or hospitals. Under what conditions (if any)
should hospitals be allowed to share patient data with
developers of AI solutions, who may be commercial
entities? If healthcare data are completely anonymized,
does a patient need to expressly consent to their use for
such improvements in healthcare? These questions, which
relate to data governance and privacy, are not unique to
healthcare AI and are currently being debated widely by
regulators, policy-makers, technologists and technology
end-users (including the public). An emerging technology
area, called privacy-enhancing technologies, may offer
data-sharing and analysis options to reduce some of the
current barriers and concerns.
Potential professional liability for physicians using
AI is another challenge67. Should hospitals and doctors
be accountable for decisions that an AI application
makes? Information provided by an AI application may
be used to inform clinical management, diagnosis or
treatment. However, algorithms, like humans, can err.
Let us suppose that an AI algorithm classifies an ovarian
cyst as most likely benign and recommends follow-up
imaging in 6 months according to the standard of care;
at the next appointment, the patient is diagnosed with
metastatic ovarian cancer and retrospective image review
suggests that the ‘cyst’ may have had malignant features
previously. This raises the question: who is liable when
AI-based diagnosis is incorrect? Questions of this kind are
currently being considered by regulators, in consultation
with legal professionals, medical professionals and AI
developers in the industry.
Research in context
As we begin to see more interdisciplinary research related
to AI in clinical medicine, difficulties arise when readers
and reviewers with a clinical background attempt to
critically assess the methodology of scientific AI papers in
a field that is, for now, largely unfamiliar to many medical
professionals. How can the clinical research community
ensure that highly technical aspects of a scientific
work have been conducted and presented correctly68?
Ultrasound professionals understand the full meaning
of ‘sonographer with 10 years of experience’ or ‘images
were reviewed by two specialists’, but may struggle with
descriptions such as ‘A feed-forward network of neurons
consisting of a number of layers that are connected to each
other was built.’28 or ‘To train the model, we first provided
the sample input, x, to the first layer and acquired the
best parameters (W, b) and activated the first hidden
layer, y, and then utilized y to predict the second layer.’30.
When assessing the clinical effectiveness and legitimacy of
scientific work for publication, several crucial questions
should be raised, including: which of the authors are AI
scientists and what is their experience; how were training
and test data acquired; what were the input variables;
how was the algorithm trained; how was the algorithm
evaluated and validated, and was the validation internal
or external; are the results reproducible. We believe that
one simple solution is to include in the Editorial Board of
journals technical reviewers with expertise in AI who are
able to ensure the soundness of the technical aspects of a
paper and assess interdisciplinary research.
To facilitate reporting of AI trials, the CONSORT
(Consolidated Standards of Reporting Trials) and SPIRIT
(Standard Protocol Items: Recommendations for Interven-
tional Trials) steering groups are expected to publish the
first international consensus-based reporting guidelines
for clinical trials evaluating AI interventions in 202069.
Summary
AI uses data and algorithms to derive computational
models of tasks that are often as good as (or better than)
humans. AI is already a part of our daily life and is a
prominent source of innovation in healthcare, helping to
develop new drugs, support clinical decisions and provide
quality assurance. Deep learning performs particularly
well in image pattern recognition and solutions based on
this approach can benefit healthcare professionals who
depend heavily on information obtained from images,
such as radiographers, pathologists and sonologists.
We have presented an overview of AI technology and
some of the issues related to the introduction of this
emerging technology into clinical practice, in the context
of ultrasound in obstetrics and gynecology. At this stage,
AI applications are in the early stages of deployment and
a systematic review would be premature. In addition,
performing a clinical systematic review in this area is
challenging because most of the published peer-reviewed
scientific articles appear in the engineering literature, which
usually focuses on the AI methodology, and few studies
have assessed clinical applicability. Lastly, algorithms and
results of approved AI applications are often not published
in scientific journals due to commercial sensitivities.
In the past, advances in women’s ultrasound have
been largely achieved through better imaging, advances in
education and training, adherence to guidelines and stan-
dards of care, and improvement of genetic technologies70.
Despite all these advances, the fundamental way in
which ultrasound images are acquired and interpreted has
remained relatively unchanged. AI opens an opportunity
to introduce in the patient-carer relationship a third 'par-
ticipant’ that is able to contribute to healthcare. Improved
quality through automatic categorization or interpreta-
tion of images and ensuring images are fit for purpose
can increase confidence in imaging-based diagnosis. In
high-income settings, this could contribute to health-
care efficiency and workflow improvements in screening.
In under-resourced settings, it opens the prospect of
strengthening ultrasound imaging by replicating basic
obstetric ultrasound where there is none, which could
allow, for example, gestational-age estimation or diagno-
sis of placenta previa. For this potential to be realized,
interdisciplinary communication between AI developers
and ultrasound professionals needs to be strengthened. A
greater understanding of how AI methods work is impor-
tant to enable clinicians to trust AI solutions. To ensure
seamless integration of AI, medical professional organi-
zations should start considering how AI affects them,
recommend that physicians publish their experiences of
using AI technologies, and consider appropriate guidelines
or committees on aspects of AI.
ACKNOWLEDGMENTS
The authors acknowledge the European Research
Council (ERC-ADG-2015 694581), UK Engineering and
Physical Sciences Research Council (EP/M013774/1,
EP/R013853/1), UK Medical Research Council
(MR/P027938/1) and The Bill and Melinda Gates
Foundation (Grant no. 49038 and OPP1197123) for
funding their work in this area. A.T.P. is supported by the
NIHR Biomedical Research Centres funding scheme. The
views expressed herein are those of the authors and not
necessarily those of the NHS, the NIHR, the Department
of Health or any of the other funders.
Disclosure
J.A.N. and A.T.P. are Senior Scientific Advisors of
Intelligent Ultrasound Ltd.
REFERENCES
1. United Kingdom Engineering and Physical Sciences Research Council. Artificial
intelligence technologies. https://epsrc.ukri.org/research/ourportfolio/researchareas/
ait/.
2. Turing AM. Computing Machinery and Intelligence. Mind 1950; LIX: 433–460.
3. McCarthy J, Minsky M, Rochester N, Shannon C. A proposal for the dartmouth
summer research project on artificial intelligence, August 1955. http://www-formal
.stanford.edu/jmc/history/dartmouth/dartmouth.html
4. Bory P. Deep new: The shifting narratives of artificial intelligence from Deep Blue to
AlphaGo. Convergence 2019; 25: 627– 642.
5. Abramoff MD, Lavin PT, Birch M, Shah N, Folk JC. Pivotal trial of an autonomous
AI-based diagnostic system for detection of diabetic retinopathy in primary care
offices. NPJ Digit Med 2018; 1: 39.
6. Wang P, Berzin TM, Glissen Brown JR, Bharadwaj S, Becq A, Xiao X, Liu P, Li L,
Song Y, Zhang D, Li Y, Xu G, Tu M, Liu X. Real-time automatic detection system
increases colonoscopic polyp and adenoma detection rates: a prospective randomised
controlled study. Gut 2019; 68: 1813– 1819.
7. Rezaii N, Walker E, Wolff P. A machine learning approach to predicting psychosis
using semantic density and latent content analysis. NPJ Schizophr 2019; 5:9.
8. Artzi NS, Shilo S, Hadar E, Rossman H, Barbash-Hazan S, Ben-Haroush A, Balicer
RD, Feldman B, Wiznitzer A, Segal E. Prediction of gestational diabetes based on
nationwide electronic health records. Nat Med 2020; 26: 71– 76.
9. Makino M, Yoshimoto R, Ono M, Itoko T, Katsuki T, Koseki A, Kudo M,
Haida K, Kuroda J, Yanagiya R, Saitoh E, Hoshinaga K, Yuzawa Y, Suzuki A.
Artificial intelligence predicts the progression of diabetic kidney disease using big
data machine learning. Sci Rep 2019; 9: 11862.
10. Annarumma M, Withey SJ, Bakewell RJ, Pesce E, Goh V, Montana G. Automated
Triaging of Adult Chest Radiographs with Deep Artificial Neural Networks.
Radiology 2019; 291: 196– 202.
11. Titano JJ, Badgeley M, Schefflein J, Pain M, Su A, Cai M, Swinburne N, Zech J,
Kim J, Bederson J, Mocco J, Drayer B, Lehar J, Cho S, Costa A, Oermann EK.
Automated deep-neural-network surveillance of cranial images for acute neurologic
events. Nat Med 2018; 24: 1337– 1341.
12. Liu X, Faes L, Kale AU, Wagner SK, Fu DJ, Bruynseels A, Mahendiran T, Moraes G,
Shamdas M, Kern C, Ledsam JR, Schmid MK, Balaskas K, Topol EJ, Bachmann
LM, Keane PA, Denniston AK. A comparison of deep learning performance against
health-care professionals in detecting diseases from medical imaging: a systematic
review and meta-analysis. Lancet Digit Health 2019; 1: e271– e297.
13. Marcus GM. The Apple Watch can detect atrial fibrillation: so what now? Nat Rev
Cardiol 2020; 17: 135– 136.
14. Ianevski A, Giri AK, Gautam P, Kononov A, Potdar S, Saarela J, Wennerberg K,
Aittokallio T. Prediction of drug combination effects with a minimal set of
experiments. Nat Mach Intell 2019; 1: 568– 577.
15. Wang X, Terashi G, Christoffer CW, Zhu M, Kihara D. Protein Docking Model
Evaluation by 3D Deep Convolutional Neural Networks. Bioinformatics 2020; 36:
2113– 2118.
16. Zhdanov A, Atluri S, Wong W, Vaghei Y, Daskalakis ZJ, Blumberger DM,
Frey BN, Giacobbe P, Lam RW, Milev R, Mueller DJ, Turecki G, Parikh SV,
Rotzinger S, Soares CN, Brenner CA, Vila-Rodriguez F, McAndrews MP, Kleffner K,
Alonso-Prieto E, Arnott SR, Foster JA, Strother SC, Uher R, Kennedy SH, Farzan F.
Use of Machine Learning for Predicting Escitalopram Treatment Outcome From
Electroencephalography Recordings in Adult Patients With Depression. JAMA Netw
Open 2020; 3: e1918377.
17. Langlotz CP, Allen B, Erickson BJ, Kalpathy-Cramer J, Bigelow K, Cook TS, Flanders
AE, Lungren MP, Mendelson DS, Rudie JDJR. A roadmap for foundational research
on artificial intelligence in medical imaging: From the 2018 NIH/RSNA/ACR/The
Academy Workshop. Radiology 2019; 291: 781– 791.
18. Safavi KC, Khaniyev T, Copenhaver M, Seelen M, Zenteno Langle AC, Zanger J,
Daily B, Levi R, Dunn P. Development and Validation of a Machine Learning Model
to Aid Discharge Processes for Inpatient Surgical Care. JAMA Netw Open 2019; 2:
e1917221.
19. Chang PJ. Moving Artificial Intelligence from Feasible to Real: Time to Drill for Gas
and Build Roads. Radiology 2020; 294: 432– 433.
20. Majkowska A, Mittal S, Steiner DF, Reicher JJ, McKinney SM, Duggan GE,
Eswaran K, Cameron Chen PH, Liu Y, Kalidindi SR, Ding A, Corrado GS, Tse D,
Shetty S. Chest Radiograph Interpretation with Deep Learning Models: Assess-
ment with Radiologist-adjudicated Reference Standards and Population-adjusted
Evaluation. Radiology 2020; 294: 421– 431.
21. Timmerman D, Verrelst H, Bourne TH, De Moor B, Collins WP, Vergote I,
Vandewalle J. Artificial neural network models for the preoperative discrimination
between malignant and benign adnexal masses. Ultrasound Obstet Gynecol 1999;
13: 17– 25.
22. Eriksson LSE, Epstein E, Testa AC, Fischerova D, Valentin L, Sladkevicius P,
Franchi D, Fruhauf F, Fruscio R, Haak LA, Opolskiene G, Mascilini F, Alcazar JL,
Van Holsbeke C, Chiappa V, Bourne T, Lindqvist PG, Van Calster B, Timmerman D,
Verbakel JY, Van den Bosch T, Wynants L. Ultrasound-based risk model for
preoperative prediction of lymph-node metastases in women with endometrial cancer:
model-development study. Ultrasound Obstet Gynecol 2020; 56: 443– 452.
23. Huang YL, Chen HY. Computer-aided diagnosis of urodynamic stress incontinence
with vector-based perineal ultrasound using neural networks. Ultrasound Obstet
Gynecol 2007; 30: 1002– 1006.
24. van den Noort F, van der Vaart CH, Grob ATM, van de Waarsenburg MK, Slump
CH, van Stralen M. Deep learning enables automatic quantitative assessment of
puborectalis muscle and urogenital hiatus in plane of minimal hiatal dimensions.
Ultrasound Obstet Gynecol 2019; 54: 270– 275.
25. Huang YL, Kuo SJ, Chang CS, Liu YK, Moon WK, Chen DR. Image retrieval
with principal component analysis for breast cancer diagnosis on various ultrasonic
systems. Ultrasound Obstet Gynecol 2005; 26: 558– 566.
26. Kuo SJ, Hsiao YH, Huang YL, Chen DR. Classification of benign and malignant
breast tumors using neural networks and three-dimensional power Doppler
ultrasound. Ultrasound Obstet Gynecol 2008; 32: 97– 102.
27. Huang YL, Chen DR, Jiang YR, Kuo SJ, Wu HK, Moon WK. Computer-aided
diagnosis using morphological features for classifying breast lesions on ultrasound.
Ultrasound Obstet Gynecol 2008; 32: 565– 572.
28. Neocleous AC, Syngelaki A, Nicolaides KH, Schizas CN. Two-stage approach for
risk estimation of fetal trisomy 21 and other aneuploidies using computational
intelligence systems. Ultrasound Obstet Gynecol 2018; 51: 503– 508.
29. Bonet-Carne E, Palacio M, Cobo T, Perez-Moreno A, Lopez M, Piraquive JP, Ramirez
JC, Botet F, Marques F, Gratacos E. Quantitative ultrasound texture analysis of fetal
lungs to predict neonatal respiratory morbidity. Ultrasound Obstet Gynecol 2015;
45: 427– 433.
30. Bahado-Singh RO, Sonek J, McKenna D, Cool D, Aydas B, Turkoglu O, Bjorndahl T,
Mandal R, Wishart D, Friedman P, Graham SF, Yilmaz A. Artificial intelligence and
amniotic fluid multiomics: prediction of perinatal outcome in asymptomatic women
with short cervix. Ultrasound Obstet Gynecol 2019; 54: 110– 118.
31. Tsur A, Batsry L, Toussia-Cohen S, Rosenstein MG, Barak O, Brezinov Y,
Yoeli-Ullman R, Sivan E, Sirota M, Druzin ML, Stevenson DK, Blumenfeld YJ,
Aran D. Development and validation of a machine-learning model for prediction of
shoulder dystocia. Ultrasound Obstet Gynecol 2020; 56: 588– 596.
32. Jugovic D, Tumbri J, Medic M, Jukic MK, Kurjak A, Arbeille P, Salihagic-Kadic A.
New Doppler index for prediction of perinatal brain damage in growth-restricted
and hypoxic fetuses. Ultrasound Obstet Gynecol 2007; 30: 303– 311.
33. Papageorghiou AT, Kemp B, Stones W, Ohuma EO, Kennedy SH, Purwar M,
Salomon LJ, Altman DG, Noble JA, Bertino E, Gravett MG, Pang R, Cheikh Ismail L,
Barros FC, Lambert A, Jaffer YA, Victora CG, Bhutta ZA, Villar J, International Fetal
and Newborn Growth Consortium for the 21st Century (INTERGROWTH-21st).
Ultrasound-based gestational-age estimation in late pregnancy. Ultrasound Obstet
Gynecol 2016; 48: 719– 726.
34. Xie HN, Wang N, He M, Zhang LH, Cai HM, Xian JB, Lin MF, Zheng J, Yang YZ.
Using deep-learning algorithms to classify fetal brain ultrasound images as normal
or abnormal. Ultrasound Obstet Gynecol 2020; 56: 579– 587.
35. Gobet F. The Future of Expertise: The Need for a Multidisciplinary Approach.
Journal of Expertise 2018; 1: 107– 113.
36. Littmann M, Selig K, Cohen-Lavi L, Frank Y, Hönigschmid P, Kataka E, Mösch A,
Qian K, Ron A, Schmid S, Sorbie A, Szlak L, Dagan-Wiener A, Ben-Tal N, Niv MY,
Razansky D, Schuller BW, Ankerst D, Hertz T, Rost B. Validity of machine learning
in biology and medicine increased through collaborations across fields of expertise.
Nat Mach Intell 2020; 2: 18–24.
37. American College of Radiology Data Science Institute. FDA Cleared AI Algorithms.
https://www.acrdsi.org/DSI-Services/FDA-Cleared-AI-Algorithms [Accessed May 7th, 2020].
38. Food and Drug Administration. FDA Authorizes Marketing of First Cardiac Ultrasound Software That Uses Artificial Intelligence to Guide User. https://www.fda.gov/news-events/press-announcements/fda-authorizes-marketing-first-cardiac-ultrasound-software-uses-artificial-intelligence-guide-user. 2020.
39. Harris S. Signify Research. Artificial Intelligence in Medical Imaging to Top $2 Billion by 2023. https://www.signifyresearch.net/medical-imaging/ai-medical-imaging-top-2-billion-2023/ [Accessed March 2nd, 2020].
40. Ghorbani A, Ouyang D, Abid A, He B, Chen JH, Harrington RA, Liang DH, Ashley
EA, Zou JY. Deep learning interpretation of echocardiograms. NPJ Digit Med 2020;
3: 10.
41. Liu S, Wang Y, Yang X, Lei B, Liu L, Li SX, Ni D, Wang T. Deep Learning in
Medical Ultrasound Analysis: A Review. Engineering 2019; 5: 261– 275.
42. Ouyang D, He B, Ghorbani A, Yuan N, Ebinger J, Langlotz CP, Heidenreich PA,
Harrington RA, Liang DH, Ashley EA, Zou JY. Video-based AI for beat-to-beat
assessment of cardiac function. Nature 2020; 580: 252– 256.
43. Sharma H, Droste R, Chatelain P, Drukker L, Papageorghiou AT, Noble JA.
Spatio-Temporal Partitioning And Description Of Full-Length Routine Fetal Anomaly
Ultrasound Scans. Proc IEEE Int Symp Biomed Imaging 2019; 16: 987– 990.
44. Baumgartner CF, Kamnitsas K, Matthew J, Smith S, Kainz B, Rueckert D. Real-Time
Standard Scan Plane Detection and Localisation in Fetal Ultrasound Using Fully
Convolutional Neural Networks. In Medical Image Computing and Computer-
Assisted Intervention – MICCAI 2016. Ourselin S, Joskowicz L, Sabuncu MR,
Unal G, Wells W (eds). Springer International Publishing: Cham, 2016; 203– 211.
45. Chen H, Wu L, Dou Q, Qin J, Li S, Cheng J, Ni D, Heng P. Ultrasound Standard
Plane Detection Using a Composite Neural Network Framework. IEEE Trans Cybern
2017; 47: 1576– 1586.
46. Singhal N, Mukherjee S, Perrey C. Automated assessment of endometrium from
transvaginal ultrasound using Deep Learned Snake. Presented at 2017 IEEE 14th
International Symposium on Biomedical Imaging, 2017; 283– 286.
47. Allen B, Dreyer K, McGinty GD. Integrating Artificial Intelligence Into Radiologic
Practice: A Look to the Future. J Am Coll Radiol 2020; 17: 280–283.
48. Frank MR, Autor D, Bessen JE, Brynjolfsson E, Cebrian M, Deming DJ, Feldman M,
Groh M, Lobo J, Moro E, Wang D, Youn H, Rahwan I. Toward understanding
the impact of artificial intelligence on labor. Proc Natl Acad Sci USA 2019; 116:
6531– 6539.
49. Topol EJ. Chapter six: Doctors and Patterns. In: Deep medicine: how artificial
intelligence can make healthcare human again (first edn). Basic Books: New York,
2019; 111– 135.
50. Recht M, Bryan RN. Artificial Intelligence: Threat or Boon to Radiologists? J Am
Coll Radiol 2017; 14: 1476–1480.
51. Mazurowski MA. Artificial Intelligence May Cause a Significant Disruption to the
Radiology Workforce. J Am Coll Radiol 2019; 16: 1077–1082.
52. Noseworthy J. The Future of Care - Preserving the Patient-Physician Relationship.
N Engl J Med 2019; 381: 2265–2269.
53. Waring L, Miller PK, Sloane C, Bolton G. Charting the practical dimensions
of understaffing from a managerial perspective: The everyday shape of the UK’s
sonographer shortage. Ultrasound 2018; 26: 206– 213.
54. Nelson CA, Butte AJ, Baranzini SE. Integrating biomedical research and electronic
health records to create knowledge-based biologically meaningful machine-readable
embeddings. Nat Commun 2019; 10: 3045.
55. Kansagra AP, Yu JP, Chatterjee AR, Lenchik L, Chow DS, Prater AB, Yeh J, Doshi
AM, Hawkins CM, Heilbrun ME, Smith SE, Oselkin M, Gupta P, Ali S. Big Data
and the Future of Radiology Informatics. Acad Radiol 2016; 23: 30– 42.
56. Chiappa V, Bogani G, Ditto A, Martinelli F, Murru G, Raspagliesi F. OP07.10:
Artificial intelligence weights the importance of clinical and sonographic factors
predicting nodal metastasis in endometrial cancer. Ultrasound Obstet Gynecol 2019;
54 (S1): 107– 107.
57. Babic B, Gerke S, Evgeniou T, Cohen IG. Algorithms on regulatory lockdown in
medicine. Science 2019; 366: 1202– 1204.
58. Wiens J, Price WN, Sjoding MW. Diagnosing bias in data-driven algorithms for
healthcare. Nat Med 2020; 26: 25– 26.
59. Drukker L, Droste R, Chatelain P, Noble JA, Papageorghiou AT. Expected-value
bias in routine third-trimester growth scans. Ultrasound Obstet Gynecol 2020; 55:
375– 382.
60. Nagendran M, Chen Y, Lovejoy CA, Gordon AC, Komorowski M, Harvey H, Topol
EJ, Ioannidis JPA, Collins GS, Maruthappu M. Artificial intelligence versus clinicians:
systematic review of design, reporting standards, and claims of deep learning studies.
BMJ 2020; 368: m689.
61. Brundage M, Avin S, Wang J, Belfield H, Krueger G, Hadfield G, Khlaaf H, Yang J,
Toner H, Fong R, Maharaj T, Koh PW, Hooker S, Leung J, Trask A, Bluemke E,
Lebensold J, O’Keefe C, Koren M, Ryffel T, Rubinovitz JB, Besiroglu T, Carugati F,
Clark J, Eckersley P, de Haas S, Johnson M, Laurie B, Ingerman A, Krawczuk I,
Askell A, Cammarota R, Lohn A, Krueger D, Stix C, Henderson P, Graham L,
Prunkl C, Martin B, Seger E, Zilberman N, Ó hÉigeartaigh S, Kroeger F, Sastry G,
Kagan R, Weller A, Tse B, Barnes E, Dafoe A, Scharre P, Herbert-Voss A, Rasser M,
Sodhani S, Flynn C, Gilbert TK, Dyer L, Khan S, Bengio Y, Anderljung M. Toward
Trustworthy AI Development: Mechanisms for Supporting Verifiable Claims. April
2020. https://arxiv.org/pdf/2004.07213.pdf
62. Institute for Ethics in AI. University of Oxford. https://www.schwarzmancentre.ox.ac.uk/ethicsinai [Accessed May 7th, 2020].
63. Awad E, Dsouza S, Kim R, Schulz J, Henrich J, Shariff A, Bonnefon JF, Rahwan I.
The Moral Machine experiment. Nature 2018; 563: 59– 64.
64. Char DS, Shah NH, Magnus D. Implementing Machine Learning in Health Care -
Addressing Ethical Challenges. N Engl J Med 2018; 378: 981–983.
65. Mittelstadt B. Principles alone cannot guarantee ethical AI. Nat Mach Intell 2019; 1:
501– 507.
66. Geis JR, Brady AP, Wu CC, Spencer J, Ranschaert E, Jaremko JL, Langer SG,
Borondy Kitts A, Birch J, Shields WF, van den Hoven van Genderen R, Kotter E,
Wawira Gichoya J, Cook TS, Morgan MB, Tang A, Safdar NM, Kohli M. Ethics
of Artificial Intelligence in Radiology: Summary of the Joint European and North
American Multisociety Statement. Radiology 2019; 293: 436– 440.
67. Price WN 2nd, Gerke S, Cohen IG. Potential Liability for Physicians Using Artificial
Intelligence. JAMA 2019. DOI: 10.1001/jama.2019.15064.
68. Liu Y, Chen PC, Krause J, Peng L. How to Read Articles That Use Machine Learning:
Users’ Guides to the Medical Literature. JAMA 2019; 322: 1806– 1816.
69. CONSORT-AI and SPIRIT-AI Steering Group. Reporting guidelines for clinical
trials evaluating artificial intelligence interventions are needed. Nat Med 2019; 25:
1467– 1468.
70. Abu-Rustum RS, Abuhamad AZ. Fetal imaging: past, present, and future. A journey
of marvel. BJOG 2018; 125: 1568.
71. The Alan Turing Institute. Frequently Asked Questions. https://www.turing.ac.uk/about-us/frequently-asked-questions [Accessed December 20th, 2019].
SUPPORTING INFORMATION ON THE INTERNET
The following supporting information may be found in the online version of this article:
Appendix S1 Example: ophthalmology at the forefront of artificial intelligence
Figure S1 Error rates on ImageNet Large-Scale Visual Recognition Challenge between 2010 and 2017.
Accuracy improved dramatically with introduction of deep learning in 2012 and continued to improve
thereafter. Humans perform with an error rate of approximately 5%. Figure reproduced with permission from
Langlotz et al.17.
©2020 The Authors. Ultrasound in Obstetrics & Gynecology published by John Wiley & Sons Ltd Ultrasound Obstet Gynecol 2020; 56: 498–505.
on behalf of the International Society of Ultrasound in Obstetrics and Gynecology.