
Abstract

Breast cancer is a significant global health challenge that affects both men and women, leading to cause-specific deaths. Current early screening interventions, such as digital mammography (DM), are susceptible to high false-positives and false-negatives. This paper explores the potential of convolutional neural network (CNN), a form of artificial intelligence (AI), to support screening mammography with the aim to enhance accuracy in lesion detection, image classification and diagnostic prediction. Because the adoption of AI in cancer diagnosis is still in its infancy, the objective of this paper is to provide insight into the benefits and limitations of deep learning-based approaches to detect and diagnose cancer. An analysis of the implementation of CNN in AI-screening mammography models was conducted, using the SWOT strategic analysis tool. Internal strengths that improve the predictive accuracy of CNN include transfer learning and data augmentation, whereas the internal weaknesses include a lack of data standardisation and reproducibility. External opportunities consist of increased sensitivity in differentiating between microcalcifications and non-tumorous structures, improved predictive diagnosis and reduced workload. Nevertheless, integration within clinical settings must also consider the external threats of breaching patient privacy, automation biases and the role of clinical judgement.
JCMC/ Vol 14/ No. 1/ Issue 47/ Jan-Mar, 2024 
ISSN 2091-2889 (Online) ISSN 2091-2412 (Print)
Journal of Chitwan Medical College 2024;14(47):89-94
Available online at: www.jcmc.com.np
REVIEW ARTICLE
A SWOT ANALYSIS OF BREAST CANCER DIAGNOSIS IN DIGITAL MAMMOGRAPHY USING DEEP
CONVOLUTIONAL NEURAL NETWORK
Elizabeth Yong1, Yen Nee Teo2, Leanne McKnoulty3, Ajeevan Gautam4, Rajib Chaulagain5, Kun Hing Yong4,*
1Indooroopilly State High School, Brisbane, Queensland 4068, Australia
2Institute of Malaysian and International Studies, National University of Malaysia, Bangi 43600, Selangor, Malaysia
3GUPSA editor in residence, Griffith University, Nathan, Queensland 4111, Australia
4School of Medicine and Dentistry, Griffith University, Nathan, Queensland 4111, Australia
5Department of Oral Pathology, Chitwan Medical College, Bharatpur, Chitwan 44207, Nepal
Received: 24 Jan, 2024
Accepted: 11 Mar, 2024
Published: 30 Mar, 2024
Key words: Artificial intelligence; Breast cancer; Convolutional neural network; Digital mammography; SWOT analysis.
*Correspondence to: Kun Hing Yong, School of Medicine and Dentistry, Griffith University, 170 Kessels Rd, Nathan, Queensland 4111, Australia.
Email:

DOI:


Yong E, Teo YN, McKnoulty L, Gautam A, Chaulagain R, Yong KH. A SWOT analysis of breast cancer diagnosis in digital mammography using deep convolutional neural network. Journal of Chitwan Medical College. 2024;14(47):89-94.
INTRODUCTION
Breast cancer is presently the most commonly diagnosed cancer in women globally, and the second-leading cause of mortality from cancer. Accurate cancer diagnosis of symptomatic patients at an early stage is pertinent to improve cancer outcomes, thereby reducing cause-specific deaths. In the early 1990s in Australia, screening mammography programs were implemented for the early detection and treatment of breast cancer, but their accuracy in sensitivity and specificity remains error-prone, leading to the reporting of false-negatives and false-positives, respectively. The additional imaging tests and biopsies ensuing false-positive recalls can contribute to unnecessary emotional stress for the patient. Similarly, health hazards from high radiation exposure should also be considered. Errors in interpretation and detection of abnormalities can be attributed to different breast densities, small tumours or artefacts.1 Another limitation is subjectivity in image analysis due to varied perceptions across interpreters, known as inter-reader variability. To improve diagnostic accuracy, double reading is used, in which two radiologists independently read the same screening mammogram.2 Although image analysis is performed manually by experts, factors such as fatigue and decreased attention can adversely affect the findings. Furthermore, double reading is labour intensive, meaning that time constraints on clinical evaluations and examinations can lead to a delegation of tasks from radiologists to other physicians or breast clinicians. This can lead to unfavourable outcomes for patients, who may be subjected to higher positive recall rates and false-positive interpretations, because physicians may lack sufficient radiological knowledge to exert accurate clinical judgement.3
With rapid development in computing power and data, AI has been increasingly integrated into clinical settings. Among these approaches is machine learning and, in particular, deep learning with CNN diagnostic-based approaches, whereby the technology is trained to recognise complex patterns from raw input with its multi-layered networks and make accurate connections based on the context. Its utility in lesion detection, image classification and diagnostic prediction enables additional aid to radiologists to achieve higher accuracy when interpreting DM, thereby serving as a prospective application to improve diagnosis of breast cancer.4 The applications of these technological innovations have understandably raised concerns among healthcare professionals in regard to their feasibility and diagnostic efficacy. To address the concerns of AI applications in medical imaging, an understanding of the benefits and limitations of AI tools is necessary.5
METHODS
A literature review of research published during the last 5 years was conducted to evaluate the strengths, weaknesses, opportunities and threats (SWOT) of CNN in AI models used to diagnose breast cancer. A brief analysis is provided, while the primary points are outlined in Table 1. This SWOT analysis forms the basis for governmental decision-makers and health care providers to understand the potential implementation of AI within clinical settings, and to consider future improvements in approaching the problem.
Section 2 introduces the functionality of CNN, Section 3 elucidates the strengths of applying CNN in mammography to diagnose breast cancer, Section 4 explains the external opportunities, and Sections 5 and 6 discuss the weaknesses and threats or ethical challenges. Finally, Section 7 presents the suggested future directions and the conclusion.
FUNCTIONALITY OF CNN
In deep learning, a CNN is a class of deep neural network that uses algorithms to process a large quantity of data with a grid pattern, notably in image-related analysis.6 CNN is employed for image examination, identification or classification because it can efficiently extract features from images and simplify them for better analysis. It consists of three distinct layers with functions that interconnect each other, namely an input layer, multiple hidden layers and an output layer. The initial DM image undergoes filtering in the first convolutional layer, which enhances the features, removes unwanted noise, and helps to differentiate the edges and shapes of the region under investigation. Subsequent convolutional layers enhance the feature patterns to facilitate identification of the tumour contour and enable the extraction of specific features, such as structural patterns or dominant outliers in the image, making CNN highly efficient for image processing.7 The pooling layer filters the minimum, maximum, mean or median of the set of pixels within the image that fall within the filter, to reduce the spatial size and maintain only the most crucial information.8 Decreasing the parameters increases the processing speed. The information is subsequently passed through the fully connected layer, where inputs from the feature analysis are extracted, weights are applied, and the output is predicted into classes of cancer. For example, in the study by Ragab and colleagues9, the fully connected layer classified abnormal areas as benign or malignant, while various other studies classified regions as benign, malignant or without tumour. Figure 1 depicts the structure of a classic CNN architecture.
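The two core operations described above can be sketched in a few lines of pure Python. This is an illustrative toy only, with invented pixel values and a standard vertical-edge kernel; real CNNs learn their kernels and run on optimised frameworks.

```python
# Minimal sketch of two CNN building blocks: a 2D convolution that
# responds to edges, followed by 2x2 max pooling that keeps only the
# strongest response per region (reducing spatial size, as in the text).

def convolve2d(image, kernel):
    """Valid 2D convolution (no padding, stride 1) over a 2D list."""
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for i in range(len(image) - kh + 1):
        row = []
        for j in range(len(image[0]) - kw + 1):
            s = sum(image[i + a][j + b] * kernel[a][b]
                    for a in range(kh) for b in range(kw))
            row.append(s)
        out.append(row)
    return out

def max_pool(image, size=2):
    """Non-overlapping max pooling: keep the largest value per window."""
    out = []
    for i in range(0, len(image) - size + 1, size):
        row = []
        for j in range(0, len(image[0]) - size + 1, size):
            row.append(max(image[i + a][j + b]
                           for a in range(size) for b in range(size)))
        out.append(row)
    return out

# A tiny hypothetical image patch: a bright vertical strip in a dark field.
patch = [
    [0, 0, 9, 9, 0],
    [0, 0, 9, 9, 0],
    [0, 0, 9, 9, 0],
    [0, 0, 9, 9, 0],
]
# Vertical-edge kernel: responds where intensity changes left-to-right.
edge_kernel = [[-1, 0, 1],
               [-1, 0, 1],
               [-1, 0, 1]]

feature_map = convolve2d(patch, edge_kernel)  # strong response at the edge
pooled = max_pool(feature_map)                # smaller map, strongest value kept
```

The feature map peaks where the bright strip begins, and pooling collapses it to the single most salient response, mirroring how spatial size is reduced while the crucial information is retained.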
DISCUSSION
Strengths
Transfer learning
Transfer learning refers to leveraging the learned features of a pre-trained model as the foundation for training a model to perform a new task. It takes advantage of the fact that neural networks trained on large databases of images, such as those with ImageNet, have learned and established parameters in the early layers relevant to numerous visual tasks, regardless of the specific task they are programmed to perform.10 Salehi and colleagues explained that certain functions of CNNs in lower layers, such as those dedicated to edge, texture and pattern detection, can be calibrated and applied to higher layers of the network.10 However, the specific features that must be learned will increase in complexity, where, for instance, the output layer would only respond to images of a specific tumour that it had been trained to detect. Thus, using a pre-trained model and customising the new model with additional new layers and adjustments to the number of neurons or classes, depending on the specific task requirements, has the benefit of minimising training time and requiring limited data. This means earlier models can be refined and adapted to various tasks, including detecting and classifying lesions, without retraining a deep neural network from scratch.
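The idea of freezing generic early layers and fitting only a new task-specific head can be illustrated without any deep-learning framework. In this toy sketch, the "pre-trained" feature extractor and the tiny dataset are entirely hypothetical stand-ins; the point is only that the frozen part is reused unchanged while a small new head is fitted on limited data.

```python
# Illustrative transfer-learning sketch (not a real framework):
# a frozen feature extractor is reused, and only a small new
# classification head is fitted for the new task.

def pretrained_features(image):
    """Frozen 'early layers': generic intensity statistics, standing in
    for edge/texture features learned on a large unrelated dataset."""
    flat = [p for row in image for p in row]
    mean = sum(flat) / len(flat)
    contrast = max(flat) - min(flat)
    return [mean, contrast]

def fit_new_head(examples):
    """Train only the new head: a threshold on the contrast feature,
    chosen midway between the two classes in the small new dataset."""
    benign = [pretrained_features(img)[1]
              for img, label in examples if label == "benign"]
    malignant = [pretrained_features(img)[1]
                 for img, label in examples if label == "malignant"]
    return (max(benign) + min(malignant)) / 2

def classify(image, threshold):
    return "malignant" if pretrained_features(image)[1] > threshold else "benign"

# Tiny hypothetical new-task dataset: low-contrast patches labelled
# benign, high-contrast ones malignant.
data = [
    ([[1, 1], [1, 2]], "benign"),
    ([[0, 1], [1, 1]], "benign"),
    ([[0, 9], [8, 1]], "malignant"),
    ([[1, 8], [9, 0]], "malignant"),
]
threshold = fit_new_head(data)  # only this head is "trained"
```

Only `fit_new_head` touches the new data; `pretrained_features` stays fixed, which is why transfer learning needs far less data and training time than learning the whole network from scratch.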
Figure 1: A classic CNN architecture with an input layer, convolutional layers, a pooling layer, and a fully connected layer.
Note. The final output is classified as normal, benign or malignant.
Data augmentation
In medical imaging, where the number of fully annotated mammograms available is limited, training a deep learning model with data augmentation ensures improvement to the models while also minimising data overfitting. Overfitting is a statistical error whereby the model fits too closely to the training dataset and cannot be generalised to new data.11 Data augmentation enables artificial expansion of existing datasets to generate modified copies and, hence, introduces a vast variety of patterns that the model can recognise and learn from. Improvement to data variability has been demonstrated to enhance the predictive accuracy of AI models in detecting suspicious regions of interest when presented with normal and abnormal DMs.12,13 This provides the radiologist with psychological support, by reducing the cognitive burden associated with identifying potential lesion regions.
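Geometric augmentation of this kind is straightforward to sketch: each annotated image yields several modified copies at no labelling cost. The transforms below (flips and a 90-degree rotation) are standard examples; the 2x2 "image" is a hypothetical placeholder.

```python
# Minimal sketch of geometric data augmentation: generate modified
# copies of each image so the model sees more pattern variety without
# requiring new annotated mammograms.

def hflip(image):
    """Mirror the image left-to-right."""
    return [row[::-1] for row in image]

def vflip(image):
    """Mirror the image top-to-bottom."""
    return image[::-1]

def rot90(image):
    """Rotate 90 degrees clockwise: transpose, then reverse each row."""
    return [list(row)[::-1] for row in zip(*image)]

def augment(image):
    """Return the original plus three modified copies."""
    return [image, hflip(image), vflip(image), rot90(image)]

patch = [[1, 2],
         [3, 4]]
augmented = augment(patch)  # 4 training samples from 1 annotation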
For example, GAN-based augmentation, an unsupervised deep learning method that extracts hidden properties from data to formulate its decision-making process, has shown potential
to improve accuracy in mass classification after geometric transformations from unrelated masses or increases in noise distortions.12 As such, it has also been a widely used approach in breast mass detection and mass segmentation.13 As the use of data augmentation methods expands, it is pertinent to evaluate the quality of the output and recognise that building upon minimal databases can restrict the generalisation ability of the model and potentially reinforce inherent biases.
Opportunities
Pixel-level image classification
With higher resolution DM images, conventional computer-aided diagnosis (CAD) models can distinguish between benign and malignant lesions by assessing their greyscale levels, homogeneity, gradient, patterns and shape.14 However, because dense breast tissue appears white and has similar shade and intensity values to tumorous regions containing microcalcifications, dense breast tissue, with relatively high amounts of glandular tissue and fibrous connective tissue, can hide lesions and is prone to misdiagnosis and the reporting of false negatives. AI screening can detect a potentially tumorous region and compare its intensity value with other regions of the breast, followed by segmentation of the tumour area surrounded by malignant tissues.15 This can reduce the lower sensitivity caused by human perceptual error, because it separates pixels of the cancer region from the normal region. Geras et al. showed that the addition of the deep learning method, which learns the intermediate and abstract representations of the data, can improve accuracy in lesion classification in DMs, reaching similar sensitivity to radiologists' assessment.14
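The pixel-level idea of comparing a candidate region's intensity against the rest of the breast can be reduced to a toy sketch. The threshold, margin and pixel values below are invented for illustration; real AI-CAD systems learn these decision boundaries rather than using a fixed rule.

```python
# Toy sketch of pixel-level separation: flag pixels whose intensity
# stands well above the mean background, producing a binary mask that
# separates a candidate (possibly tumorous) region from normal tissue.

def segment(image, margin=3):
    """Mark pixels as suspicious when they exceed the mean intensity
    of the whole patch by more than `margin` grey levels."""
    flat = [p for row in image for p in row]
    background = sum(flat) / len(flat)
    return [[1 if p > background + margin else 0 for p in row]
            for row in image]

# Hypothetical DM patch: a bright 2x2 region inside darker tissue.
dm_patch = [
    [2, 2, 2, 2],
    [2, 9, 9, 2],
    [2, 9, 9, 2],
    [2, 2, 2, 2],
]
mask = segment(dm_patch)  # 1s mark the candidate region's pixels
```

The resulting mask isolates the bright region at pixel level, which is the granularity at which segmentation-based models operate, as opposed to whole-image classification.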
Improved patient value through predictive diagnosis
Given the large processing capacity of AI, its capability of analysing and processing data from wide-ranging sources, including medical images, laboratory test results and patient history, enables identification of patterns and abnormalities that may otherwise be missed by human experts. Missed microcalcifications can be attributed to their small size or concealment by overlying high amounts of fibrous and glandular tissue.16 Therefore, implementing AI in mammography has the potential to increase sensitivity in differentiating between microcalcifications and non-tumorous anatomic structures, such as increased breast density. It employs image processing techniques to spatially filter the DM and improve the signal-to-noise ratio, yielding higher sensitivity for detecting true abnormalities.15 In a study by Kim et al., the classification performance of AI-CAD demonstrated a higher accuracy value of 0.938–0.970 compared to an accuracy value of 0.810–0.881 achieved by radiologists.17 Findings by Liu and colleagues also reported that combining the deep learning model into mammography attained similar diagnostic performance to that of an experienced radiologist, and significantly surpassed the performance of a junior radiologist (p=0.029; p<0.05).18 The improvement indicated promising results in reducing the quantity of unnecessary biopsies performed, showing potential for early detection and intervention of breast cancer.
Reduced workload
Numerous European countries have employed double reading with arbitration, whereas the United States typically has employed single reading with CAD.19 While standard double reading has been shown to reduce recall rates, it is labour intensive. A study by Dembrower and colleagues compared the cancer detection rates and efficiency of varying methods of interpretation: single reading by AI, double reading by two radiologists, double reading by one radiologist and AI, and triple reading by two radiologists and AI.20 The findings suggested that the performance of triple reading (95% CI 1.04–1.11) outperformed double reading by one radiologist and AI or by two radiologists (95% CI 1.00–1.09). Triple reading increased recalls by 5% and consensus discussions by 50%, while double reading by one radiologist and AI decreased recalls by 4% with a reasonable number of consensus discussions. In triple reading, the perception of the combined radiologists was favoured over the perception of the AI, indicating that the ability of AI in detecting cancer was under-estimated rather than over-estimated, explaining the slightly higher recall rates. Because the higher abnormal interpretation rate for AI and one radiologist did not translate into an increased recall rate, it would help reduce reading workload, which has been demonstrated to fall by nearly 40%.20,21 Replacement of the second reader with AI would substantially reduce the time radiologists spend reading mammograms. Another study, by Lång and colleagues, determined that mammography screening supported with AI yielded a similar cancer detection rate to standard double reading, with the recall rate being 0.2% higher at 2.2%, suggesting that the use of AI in mammography can be considered.22
Weaknesses
Lack of standardisation
Standardisaon within a clinical seng can help improve
interoperability and vast exchange of health data and
informaon. This is pernent to improve performance of
the models in imaging acquision and processing, because
the quality of image acquision aects radiomic feature
calculaons, radiomics being the extensive image-based
phenotyping of abnormalies through extracon of diverse
feature values from medical images.23 Currently, insucient
standardizaon is evident in the collecon and storage of
unstructured data, as well as in the process of unifying data
that represents a single healthcare system.24 Substanal
informaon technology and systems resources is required
to implement this, and the feasibility remains under acve
invesgaon.
One method proposes using paent-reported outcomes (PROs)
and validated quesonnaires, as they are valuable survival
indicators that can benet cancer care delivery, research
and clinical operaons.25 Nonetheless, several limitaons are
present. These include patient-level barriers, such as disability and challenges in reading and responding to the questionnaires or in recalling symptoms; clinical-level obstacles, such as a lack of staff training in interpreting and implementing PROs into clinical practices; and service-level challenges, such as a lack of PRO data logging into electronic medical records within a hospital setting.26

Table 1: SWOT analysis of CNN in AI-screening mammography models

Strengths
Transfer learning:
- Minimises training time and requires limited data through modifying a pre-trained model, tailored to suit specific requirements
Data augmentation:
- Minimises data overfitting
- Improves generalisability, image recognition, segmentation accuracy and analysis
- Enhances predictive accuracy in tumour classification

Weaknesses
- Lack of standardisation limits interoperability
- Limitations in obtaining and implementing patient-reported outcomes
- Data reproducibility is subject to data drift
- Lack of high-quality and multi-institutional datasets may reduce generalisability
- Different mechanisms performed at each CNN layer require varying levels of complexity during programming

Opportunities
Pixel-level image classification:
- Reduces false positives due to ability to discern between tumorous and non-tumorous regions
- Improves cancer detection rates
Improved patient value through predictive diagnosis:
- Increases sensitivity for detection of true abnormalities
- Improves tumour classification accuracy
- Potential to reduce quantity of unnecessary biopsies and medical costs
Opportunities for healthcare professionals:
- Enhances reading efficiency by reducing number of tests requiring radiologist interpretation
- Reduces workload

Threats
Patient privacy:
- Breach of health and personal information
- Lack of transparency
Algorithmic biases:
- Biases in input training data can produce skewed results and exacerbate health care inequality
- Ethical concerns regarding the role of AI in clinical judgement
Role of human judgement:
- Medico-legal responsibility for healthcare providers if incorrect diagnosis is made
- Potential discordance between clinical practices and AI suggestions
- Impairment in clinical judgement from over-reliance on AI technology, resulting in potential patient injury
- Jeopardisation of the learning process and clinical reasoning abilities of medical students or novice radiologists
Data Reproducibility
Data reproducibility is limited when data are transferred across healthcare systems and global communities, but even within the training environment, data drift over time for AI algorithms and advanced clinical decision support systems (CDSS) can affect their performance. This is a result of variations in the distribution, formatting or quality of data, flawed data transformation, absence of natural drift when training the model, or covariate shift.27 Thus, given their evolving nature, standards must be incorporated to continuously monitor AI algorithms and ensure their validity, even if AI were to be successfully implemented as a technological practice in medicine.
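Continuous monitoring for drift can be sketched with a very simple check: compare a statistic of newly acquired images against the training-era baseline and flag the model for review when they diverge. The statistic (mean intensity), tolerance and data below are invented for illustration; production systems use richer distributional tests.

```python
# Illustrative drift-monitoring sketch: flag drift when the intensity
# distribution of incoming images shifts away from the training baseline
# (e.g. after a scanner or acquisition-protocol change).

def mean_intensity(images):
    """Mean pixel value across a batch of 2D images."""
    pixels = [p for img in images for row in img for p in row]
    return sum(pixels) / len(pixels)

def drift_detected(baseline_images, incoming_images, tolerance=1.0):
    """Flag drift when mean intensity shifts beyond the tolerance."""
    shift = abs(mean_intensity(incoming_images)
                - mean_intensity(baseline_images))
    return shift > tolerance

baseline = [[[2, 2], [2, 2]]]   # hypothetical training-era acquisitions
stable = [[[2, 3], [2, 2]]]     # similar new data: no drift expected
shifted = [[[7, 8], [7, 8]]]    # changed acquisition: drift expected
```

A check like this, run routinely on incoming data, is one concrete form the "standards to continuously monitor AI algorithms" mentioned above could take.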
Threats regarding Ethical Challenges
Patient Privacy
Precision medical technology relies on extensive medical information for cancer diagnosis, screening, data processing, optimising care delivery and conducting clinical operations. To train models effectively, medical researchers need access to patients' personal health records. However, concerns arise regarding the potential misuse of data, leading to issues like identity theft, insurance fraud and illegal acquisition of prescription drugs. To ensure ethical use of patient data in clinical practice, medical researchers must be transparent about how data will be used. Additionally, they should implement robust safety measures to safeguard patient privacy and obtain informed consent from individuals contributing their data.
Algorithmic Biases
Bias within AI algorithms is affected by the bias within the data they are trained on. If a dataset is biased towards a particular demographic group, the validity of the AI-generated results in predicting the cancer outcomes of individuals from other demographic groups is reduced by either over-representing or under-representing certain populations. To prevent the perpetuation of inequalities in healthcare by AI algorithms that may contribute to potential harm, diverse and more representative datasets should be used, while inherent biases should undergo careful investigation to ensure they are not overlooked.
Role of Human Judgement
Although radiologists are blinded to the output of the AI system to prevent double reading or over-reliance on AI, the validity of the consensus decision may be influenced by under- or over-estimation of the accuracy of AI systems.19 This will result in variations in recall rates and cancer detection. A strength may be a reduction in recall rates by introducing the higher specificity of experts to mitigate the higher cancer detection rates of AI. However, over-reliance on AI could lead clinicians to overlook their critical clinical judgement, irrespective of their experience. As such, it can increase the risks of accountability when an incorrect diagnosis is made, as recommended by AI, which results in avoidable harm to patients.
CONCLUSION
Current evidence regarding the integration of AI in clinical settings has shown promising results, in that AI-supported screening mammography improves cancer detection rates or is level with senior radiologists, while also enhancing patient outcomes and alleviating radiologists' workload. A main advantage is its enhanced sensitivity in discerning benign from malignant lesions within dense breast tissue, a challenging diagnosis, thereby minimising perceptual errors. This can improve accuracy in diagnostic performance and facilitate predictive diagnosis for early intervention. Nevertheless, the availability of well-curated datasets to ensure high-quality outcomes by AI systems, enabling sufficient, reliable data generalisation and cancer detection, is yet to be assured. As the results of this paper showcase, considerable risks could emerge that impact the accuracy of the data and, if not mitigated, would affect patient safety. These incorporate ethical issues around medical responsibility for any diagnostic errors made, human oversight and transparency. Thus, investment to support clinical trials researching and evaluating the outcomes and performance of AI algorithms for patients and providers in breast screening mammography is encouraged, to validate their efficacy, validity and reliability when applied in routine clinical practice.
REFERENCES:
1. Nori J, Gill MK, Vignoli C, Bicchierai G, De Benedetto D, Di Naro F, Vanzi E, Boeri C, Miele V. Artefacts in contrast enhanced digital mammography: how can they affect diagnostic image quality and confuse clinical diagnosis? Insights into Imaging. 2020;11(1):16. [DOI]
2. Salim M, Dembrower K, Eklund M, Lindholm P, Strand F. Range of Radiologist Performance in a Population-based Screening Cohort of 1 Million Digital Mammography Examinations. Radiology. 2020;297(1):33-9. [DOI]
3. Chen Y, James JJ, Michalopoulou E, Darker IT, Jenkins J. Performance of Radiologists and Radiographers in Double Reading Mammograms: The UK National Health Service Breast Screening Program. Radiology. 2023;306(1):102-9. [DOI]
4. Do S, Song KD, Chung JW. Basics of Deep Learning: A Radiologist's Guide to Understanding Published Radiology Articles on Deep Learning. Korean J Radiol. 2020;21(1):33-41. Epub 2020/01/11. [DOI]
5. Teo YN, Yong KH, Gautam A, Chaulagain R. Guarding our future: Harnessing artificial intelligence to combat antimicrobial resistance and raise public awareness. Journal of Chitwan Medical College. 2023;13(3):1-2. [DOI]
6. Nasser M, Yusof UK. Deep Learning Based Methods for Breast Cancer Diagnosis: A Systematic Review and Future Direction. Diagnostics. 2023;13(1):161. [DOI]
7. Albalawi U, Manimurugan S, Varatharajan R. Classification of breast cancer mammogram images using convolution neural network. Concurrency and Computation: Practice and Experience. 2022;34(13):e5803. [DOI]
8. Zafar A, Aamir M, Mohd Nawi N, Arshad A, Riaz S, Alruban A, Dutta AK, Almotairi S. A Comparison of Pooling Methods for Convolutional Neural Networks. Applied Sciences. 2022;12(17):8643. [DOI]
9. Ragab DA, Sharkas M, Marshall S, Ren J. Breast cancer detection using deep convolutional neural networks and support vector machines. PeerJ. 2019;7:e6201. Epub 2019/02/05. [DOI]
10. Salehi AW, Khan S, Gupta G, Alabduallah BI, Almjally A, Alsolai H, Siddiqui T, Mellit A. A Study of CNN and Transfer Learning in Medical Imaging: Advantages, Challenges, Future Scope. Sustainability. 2023;15(7):5930. [DOI]
11. Ying X. An Overview of Overfitting and its Solutions. Journal of Physics: Conference Series. 2019;1168(2):022022. [DOI]
12. Oza P, Sharma P, Patel S, Adedoyin F, Bruno A. Image Augmentation Techniques for Mammogram Analysis. Journal of Imaging. 2022;8(5):141. [DOI]
13. Desai SD, Giraddi S, Verma N, Gupta P, Ramya S, editors. Breast Cancer Detection Using GAN for Limited Labeled Dataset. 2020 12th International Conference on Computational Intelligence and Communication Networks (CICN); 2020 25-26 Sept. 2020. [DOI]
14. Geras KJ, Mann RM, Moy L. Artificial Intelligence for Mammography and Digital Breast Tomosynthesis: Current Concepts and Future Perspectives. Radiology. 2019;293(2):246-59. Epub 2019/09/25. [DOI]
15. Shen L, Margolies LR, Rothstein JH, Fluder E, McBride R, Sieh W. Deep Learning to Improve Breast Cancer Detection on Screening Mammography. Sci Rep. 2019;9(1):12495. Epub 2019/08/31. [DOI]
16. Kressin NR, Wormwood JB, Battaglia TA, Maschke AD, Slanetz PJ, Pankowska M, Gunn CM. Women's Understandings and Misunderstandings of Breast Density and Related Concepts: A Mixed Methods Study. J Womens Health (Larchmt). 2022;31(7):983-90. Epub 2022/03/02. [DOI]
17. Kim H-E, Kim HH, Han B-K, Kim KH, Han K, Nam H, Lee EH, Kim E-K. Changes in cancer detection and false-positive recall in mammography using artificial intelligence: a retrospective, multireader study. The Lancet Digital Health. 2020;2(3):e138-e48. [DOI]
18. Liu H, Chen Y, Zhang Y, Wang L, Luo R, Wu H, Wu C, Zhang H, Tan W, Yin H, Wang D. A deep learning model integrating mammography and clinical factors facilitates the malignancy prediction of BI-RADS 4 microcalcifications in breast cancer screening. European Radiology. 2021;31(8):5902-12. [DOI]
19. Taylor-Phillips S, Stinton C. Double reading in breast cancer screening: considerations for policy-making. Br J Radiol. 2020;93(1106):20190610. Epub 2019/10/17. [DOI]
20. Dembrower K, Crippa A, Colón E, Eklund M, Strand F. Artificial intelligence for breast cancer detection in screening mammography in Sweden: a prospective, population-based, paired-reader, non-inferiority study. The Lancet Digital Health. 2023;5(10):e703-e11. [DOI]
21. Rodriguez-Ruiz A, Lång K, Gubern-Merida A, Teuwen J, Broeders M, Gennaro G, et al. Can we reduce the workload of mammographic screening by automatic identification of normal exams with artificial intelligence? A feasibility study. Eur Radiol. 2019;29(9):4825-32. Epub 2019/04/18. [DOI]
22. Lång K, Josefsson V, Larsson A-M, Larsson S, Högberg C, Sartor H, et al. Artificial intelligence-supported screen reading versus standard double reading in the Mammography Screening with Artificial Intelligence trial (MASAI): a clinical safety analysis of a randomised, controlled, non-inferiority, single-blinded, screening accuracy study. The Lancet Oncology. 2023;24(8):936-44. [DOI]
23. Li XT, Huang RY. Standardization of imaging methods for machine learning in neuro-oncology. Neurooncol Adv. 2020;2(Suppl 4):iv49-iv55. Epub 2021/02/02. [DOI]
24. Sedlakova J, Daniore P, Horn Wintsch A, Wolf M, Stanikic M, Haag C, Sieber C, Schneider G, Staub K, Alois Ettlin D, Grübner O, Rinaldi F, von Wyl V. Challenges and best practices for digital unstructured data enrichment in health research: A systematic narrative review. PLOS Digit Health. 2023;2(10):e0000347. Epub 2023/10/11. [DOI]
25. Caminiti C, Maglietta G, Diodati F, Puntoni M, Marcomini B, Lazzarelli S, Pinto C, Perrone F. The Effects of Patient-Reported Outcome Screening on the Survival of People with Cancer: A Systematic Review and Meta-Analysis. Cancers (Basel). 2022;14(21). Epub 2022/11/12. [DOI]
26. Nguyen H, Butow P, Dhillon H, Sundaresan P. A review of the barriers to using Patient-Reported Outcomes (PROs) and Patient-Reported Outcome Measures (PROMs) in routine cancer care. J Med Radiat Sci. 2021;68(2):186-95. Epub 2020/08/21. [DOI]
27. Shreve JT, Khanani SA, Haddad TC. Artificial Intelligence in Oncology: Current Capabilities, Future Opportunities, and Ethical Considerations. American Society of Clinical Oncology Educational Book. 2022;(42):842-51. [DOI]
Article
Full-text available
Digital data play an increasingly important role in advancing health research and care. However, most digital data in healthcare are in an unstructured and often not readily accessible format for research. Unstructured data are often found in a format that lacks standardization and needs significant preprocessing and feature extraction efforts. This poses challenges when combining such data with other data sources to enhance the existing knowledge base, which we refer to as digital unstructured data enrichment. Overcoming these methodological challenges requires significant resources and may limit the ability to fully leverage their potential for advancing health research and, ultimately, prevention, and patient care delivery. While prevalent challenges associated with unstructured data use in health research are widely reported across literature, a comprehensive interdisciplinary summary of such challenges and possible solutions to facilitate their use in combination with structured data sources is missing. In this study, we report findings from a systematic narrative review on the seven most prevalent challenge areas connected with the digital unstructured data enrichment in the fields of cardiology, neurology and mental health, along with possible solutions to address these challenges. Based on these findings, we developed a checklist that follows the standard data flow in health research studies. This checklist aims to provide initial systematic guidance to inform early planning and feasibility assessments for health research studies aiming combining unstructured data with existing data sources. Overall, the generality of reported unstructured data enrichment methods in the studies included in this review call for more systematic reporting of such methods to achieve greater reproducibility in future studies.
Article
Full-text available
Research in the medical imaging field using deep learning approaches has become progressively contingent. Scientific findings reveal that supervised deep learning methods’ performance heavily depends on training set size, which expert radiologists must manually annotate. The latter is quite a tiring and time-consuming task. Therefore, most of the freely accessible biomedical image datasets are small-sized. Furthermore, it is challenging to have big-sized medical image datasets due to privacy and legal issues. Consequently, not a small number of supervised deep learning models are prone to overfitting and cannot produce generalized output. One of the most popular methods to mitigate the issue above goes under the name of data augmentation. This technique helps increase training set size by utilizing various transformations and has been publicized to improve the model performance when tested on new data. This article surveyed different data augmentation techniques employed on mammogram images. The article aims to provide insights into basic and deep learning-based augmentation techniques.
Article
Background: Artificial intelligence (AI) as an independent reader of screening mammograms has shown promise, but there are few prospective studies. Our aim was to conduct a prospective clinical trial to examine how AI affects cancer detection and false positive findings in a real-world setting. Methods: ScreenTrustCAD was a prospective, population-based, paired-reader, non-inferiority study done at the Capio Sankt Göran Hospital in Stockholm, Sweden. Consecutive women without breast implants aged 40-74 years participating in population-based screening in the geographical uptake area of the study hospital were included. The primary outcome was screen-detected breast cancer within 3 months of mammography, and the primary analysis was to assess non-inferiority (non-inferiority margin of 0·15 relative reduction in breast cancer diagnoses) of double reading by one radiologist plus AI compared with standard-of-care double reading by two radiologists. We also assessed single reading by AI alone and triple reading by two radiologists plus AI compared with standard-of-care double reading by two radiologists. This study is registered with ClinicalTrials.gov, NCT04778670. Findings: From April 1, 2021, to June 9, 2022, 58 344 women aged 40-74 years underwent regular mammography screening, of whom 55 581 were included in the study. 269 (0·5%) women were diagnosed with screen-detected breast cancer based on an initial positive read: double reading by one radiologist plus AI was non-inferior for cancer detection compared with double reading by two radiologists (261 [0·5%] vs 250 [0·4%] detected cases; relative proportion 1·04 [95% CI 1·00-1·09]). Single reading by AI (246 [0·4%] vs 250 [0·4%] detected cases; relative proportion 0·98 [0·93-1·04]) and triple reading by two radiologists plus AI (269 [0·5%] vs 250 [0·4%] detected cases; relative proportion 1·08 [1·04-1·11]) were also non-inferior to double reading by two radiologists. 
Interpretation: Replacing one radiologist with AI for independent reading of screening mammograms resulted in a 4% higher non-inferior cancer detection rate compared with radiologist double reading. Our study suggests that AI in the study setting has potential for controlled implementation, which would include risk management and real-world follow-up of performance. Funding: Swedish Research Council, Swedish Cancer Society, Region Stockholm, and Lunit.
Article
Background: Retrospective studies have shown promising results using artificial intelligence (AI) to improve mammography screening accuracy and reduce screen-reading workload; however, to our knowledge, a randomised trial has not yet been conducted. We aimed to assess the clinical safety of an AI-supported screen-reading protocol compared with standard screen reading by radiologists following mammography. Methods: In this randomised, controlled, population-based trial, women aged 40-80 years eligible for mammography screening (including general screening with 1·5-2-year intervals and annual screening for those with moderate hereditary risk of breast cancer or a history of breast cancer) at four screening sites in Sweden were informed about the study as part of the screening invitation. Those who did not opt out were randomly allocated (1:1) to AI-supported screening (intervention group) or standard double reading without AI (control group). Screening examinations were automatically randomised by the Picture Archive and Communications System with a pseudo-random number generator after image acquisition. The participants and the radiographers acquiring the screening examinations, but not the radiologists reading the screening examinations, were masked to study group allocation. The AI system (Transpara version 1.7.0) provided an examination-based malignancy risk score on a 10-level scale that was used to triage screening examinations to single reading (score 1-9) or double reading (score 10), with AI risk scores (for all examinations) and computer-aided detection marks (for examinations with risk score 8-10) available to the radiologists doing the screen reading. 
Here we report the prespecified clinical safety analysis, to be done after 80 000 women were enrolled, to assess the secondary outcome measures of early screening performance (cancer detection rate, recall rate, false positive rate, positive predictive value [PPV] of recall, and type of cancer detected [invasive or in situ]) and screen-reading workload. Analyses were done in the modified intention-to-treat population (ie, all women randomly assigned to a group with one complete screening examination, excluding women recalled due to enlarged lymph nodes diagnosed with lymphoma). The lowest acceptable limit for safety in the intervention group was a cancer detection rate of more than 3 per 1000 participants screened. The trial is registered with ClinicalTrials.gov, NCT04838756, and is closed to accrual; follow-up is ongoing to assess the primary endpoint of the trial, interval cancer rate. Findings: Between April 12, 2021, and July 28, 2022, 80 033 women were randomly assigned to AI-supported screening (n=40 003) or double reading without AI (n=40 030). 13 women were excluded from the analysis. The median age was 54·0 years (IQR 46·7-63·9). Race and ethnicity data were not collected. AI-supported screening among 39 996 participants resulted in 244 screen-detected cancers, 861 recalls, and a total of 46 345 screen readings. Standard screening among 40 024 participants resulted in 203 screen-detected cancers, 817 recalls, and a total of 83 231 screen readings. Cancer detection rates were 6·1 (95% CI 5·4-6·9) per 1000 screened participants in the intervention group, above the lowest acceptable limit for safety, and 5·1 (4·4-5·8) per 1000 in the control group-a ratio of 1·2 (95% CI 1·0-1·5; p=0·052). Recall rates were 2·2% (95% CI 2·0-2·3) in the intervention group and 2·0% (1·9-2·2) in the control group. The false positive rate was 1·5% (95% CI 1·4-1·7) in both groups. 
The PPV of recall was 28·3% (95% CI 25·3-31·5) in the intervention group and 24·8% (21·9-28·0) in the control group. In the intervention group, 184 (75%) of 244 cancers detected were invasive and 60 (25%) were in situ; in the control group, 165 (81%) of 203 cancers were invasive and 38 (19%) were in situ. The screen-reading workload was reduced by 44·3% using AI. Interpretation: AI-supported mammography screening resulted in a similar cancer detection rate compared with standard double reading, with a substantially lower screen-reading workload, indicating that the use of AI in mammography screening is safe. The trial was thus not halted and the primary endpoint of interval cancer rate will be assessed in 100 000 enrolled participants after 2-years of follow up. Funding: Swedish Cancer Society, Confederation of Regional Cancer Centres, and the Swedish governmental funding for clinical research (ALF).
Article
Background Double reading can be used in screening mammography, but it is labor intensive. There is limited evidence on whether trained radiographers (ie, technologists) may be used to provide double reading. Purpose To compare the performance of radiologists and radiographers double reading screening mammograms, considering reader experience level. Materials and Methods In this retrospective study, performance and experience data were obtained for radiologists and radiographer readers of all screening mammograms in England from April 2015 to March 2016. Cancer detection rate (CDR), recall rate (RR), and positive predictive value (PPV) of recall based on biopsy-proven findings were calculated for first readers. Performance metrics were analyzed according to reader professional group and years of reading experience using the analysis of variance test. P values less than .05 were considered to indicate statistically significant difference. Results During the study period, 401 readers (224 radiologists and 177 radiographers) double read 1 404 395 screening digital mammograms. There was no difference in CDR between radiologist and radiographer readers (mean, 7.84 vs 7.53 per 1000 examinations, respectively; P = .08) and no difference for readers with more than 10 years of experience compared with 5 years or fewer years of experience, regardless of professional group (mean, 7.75 vs 7.71 per 1000 examinations respectively, P = .87). No difference in the mean RR was observed between radiologists and radiographer readers (5.0% vs 5.2%, respectively, P = .63). A lower RR was seen for readers with more than 10 years of experience compared with 5 years or fewer, regardless of professional group (mean, 4.8% vs 5.8%, respectively; P = .001). No variation in PPV was observed between them (P = .42), with PPV values of 17.1% for radiologists versus 16.1% for radiographers. 
A higher PPV was seen for readers with more than 10 years of experience compared with 5 years or less, regardless of professional group (mean, 17.5% and 14.9%, respectively; P = .02). Conclusion No difference in performance was observed between radiographers and radiologists reading screening mammograms in a program that used double reading. Published under a CC BY 4.0 license Online supplemental material is available for this article. See also the editorial by Hooley and Durand in this issue.
Article
The promise of highly personalized oncology care using artificial intelligence (AI) technologies has been forecasted since the emergence of the field. Cumulative advances across the science are bringing this promise to realization, including refinement of machine learning- and deep learning algorithms; expansion in the depth and variety of databases, including multiomics; and the decreased cost of massively parallelized computational power. Examples of successful clinical applications of AI can be found throughout the cancer continuum and in multidisciplinary practice, with computer vision-assisted image analysis in particular having several U.S. Food and Drug Administration-approved uses. Techniques with emerging clinical utility include whole blood multicancer detection from deep sequencing, virtual biopsies, natural language processing to infer health trajectories from medical notes, and advanced clinical decision support systems that combine genomics and clinomics. Substantial issues have delayed broad adoption, with data transparency and interpretability suffering from AI's "black box" mechanism, and intrinsic bias against underrepresented persons limiting the reproducibility of AI models and perpetuating health care disparities. Midfuture projections of AI maturation involve increasing a model's complexity by using multimodal data elements to better approximate an organic system. Far-future positing includes living databases that accumulate all aspects of a person's health into discrete data elements; this will fuel highly convoluted modeling that can tailor treatment selection, dose determination, surveillance modality and schedule, and more. The field of AI has had a historical dichotomy between its proponents and detractors. The successful development of recent applications, and continued investment in prospective validation that defines their impact on multilevel outcomes, has established a momentum of accelerated progress.