Breast cancer is a significant global health challenge that affects both men and women, leading to cause-specific deaths. Current early screening interventions, such as digital mammography (DM), are susceptible to high false-positives and false-negatives. This paper explores the potential of convolutional neural network (CNN), a form of artificial intelligence (AI), to support screening mammography with the aim to enhance accuracy in lesion detection, image classification and diagnostic prediction. Because the adoption of AI in cancer diagnosis is still in its infancy, the objective of this paper is to provide insight into the benefits and limitations of deep learning-based approaches to detect and diagnose cancer. An analysis of the implementation of CNN in AI-screening mammography models was conducted, using the SWOT strategic analysis tool. Internal strengths that improve the predictive accuracy of CNN include transfer learning and data augmentation, whereas the internal weaknesses include a lack of data standardisation and reproducibility. External opportunities consist of increased sensitivity in differentiating between microcalcifications and non-tumorous structures, improved predictive diagnosis and reduced workload. Nevertheless, integration within clinical settings must also consider the external threats of breaching patient privacy, automation biases and the role of clinical judgement.
Elizabeth Yong1, Yen Nee Teo2, Leanne McKnoulty3, Ajeevan Gautam4, Rajib Chaulagain5, Kun Hing Yong4,*
1Indooroopilly State High School, Brisbane, Queensland 4068, Australia
2Instute of Malaysian and Internaonal Studies, Naonal University of Malaysia, Bangi 43600, Selangor, Malaysia
3GUPSA editor in residence, Grith University, Nathan, Queensland 4111, Australia
4School of Medicine and Denstry, Grith University, Nathan, Queensland 4111, Australia
5Department of Oral Pathology, Chitwan Medical College, Bharatpur, Chitwan 44207, Nepal
Received: 24 Jan, 2024
Accepted: 11 Mar, 2024
Published: 30 Mar, 2024
Key words: Artificial intelligence; Breast
cancer; Convolutional neural network; Digital
mammography; SWOT analysis.
*Correspondence to: Kun Hing Yong, School of
Medicine and Dentistry, Griffith University, 170 Kessels
Rd, Nathan Queensland 4111, Australia.
Yong E, Teo YN, McKnoulty L, Gautam A, Chaulagain
R, Yong KH. A SWOT analysis of breast cancer
diagnosis in digital mammography using deep
convolutional neural network. Journal of Chitwan
Medical College. 2024;14(47):89-94.
Breast cancer is presently the most commonly diagnosed cancer
in women globally, and the second-leading cause of mortality
from cancer. Accurate cancer diagnosis of symptomatic
patients at an early stage is pertinent to improve cancer
outcomes, thereby reducing cause-specific deaths. In the
early 1990s in Australia, screening mammography programs
were implemented for the early detection and treatment of
breast cancer, but their sensitivity and specificity
remain error-prone, leading to the reporting of false-negatives
and false-positives, respectively. The additional imaging tests
and biopsies following false-positive recalls can contribute to
unnecessary emotional stress for the patient. Similarly, health
hazards from high radiation exposure should also be considered.
Errors in interpretaon and detecon of abnormalies can
be aributed to dierent breast densies, small tumours or
arfacts.1 Another limitaon is subjecvity in image analysis
due to varied percepons across interpreters, known as inter-
reader variability. During double reading to improve diagnosc
accuracy, two radiologists independently read the same
screening mammography.2 Despite image analysis performed
manually by experts, factors such as fague and decreased
aenon can adversely aect the results ndings. Furthermore,
double reading is labour intensive, implying that the me
constraints on clinical evaluaons and examinaons can lead
to a delegaon of tasks from radiologists to other physicians
or breast clinicians. This can lead to unfavourable outcomes
for the paents, being subjected to higher posive recall rates
and false-posive interpretaons, because physicians may lack
in sucient radiological knowledge to exert accurate clinical
With rapid development in computing power and data, AI has
been increasingly integrated in clinical settings. Among these
approaches is machine learning and, in particular, deep learning with
CNN diagnostic-based approaches, whereby the technology is
trained to recognise complex patterns from raw input with its
multi-layered networks and make accurate connections based
on the context. Its utility in lesion detection, image classification
and diagnostic prediction provides additional aid to radiologists
to achieve higher accuracy when interpreting DM, thereby
serving as a prospective application to improve diagnosis
of breast cancer.4 The applications of these technological
innovations have understandably raised concerns among
healthcare professionals in regard to their feasibility and
diagnostic efficacy. To address the concerns of AI applications
in medical imaging, an understanding of the benefits and
limitations of AI tools is necessary.5
A literature review of research published during the last 5
years was conducted to evaluate the strengths, weaknesses,
opportunities and threats (SWOT) of CNN in AI models used
to diagnose breast cancer. A brief analysis is provided, while
the primary points are outlined in Table 1. This SWOT analysis
forms the basis for governmental decision-makers and health
care providers to understand the potential implementation of
AI within clinical settings, and to consider future improvements
in approaching the problem.
Secon 2 introduces the funconality of CNN, Secon 3
elucidates the strengths of applying CNN in mammography
to diagnose breast cancer, Secon 4 explains the external
opportunies, and Secons 5 and 6 discuss the weaknesses
and threats or ethical challenges. Finally, Secon 7 presents
the suggested future direcons and the conclusion.
In deep learning, a CNN is a class of deep neural network that
uses algorithms to process a large quantity of data with a grid
pattern, notably in image-related analysis.6 CNN is employed
for image examination, identification or classification because
it can efficiently extract features from images and simplify
them for better analysis. It consists of three distinct types of
layer with interconnected functions, namely an input
layer, multiple hidden layers and an output layer. The initial
DM image undergoes filtering in the first convolutional layer,
which enhances the features, removes unwanted noise, and
helps to differentiate the edges and shapes of the region
under investigation. Subsequent convolutional layers enhance
the feature patterns to facilitate identification of tumour
contour and enable the extraction of specific features, such as
structural patterns or dominant outliers in the image, making
CNN highly efficient for image processing.7 The pooling layer
filters the minimum, maximum, mean or median of the set of
pixels within the image that fall within the filter, to reduce the
spatial size and maintain only the most crucial information.8
Decreasing the parameters increases the processing speed.
The information is subsequently passed through the fully
connected layer, which takes the extracted features as inputs,
applies weights and predicts the output as classes of cancer.
For example, in the study by Ragab and
colleagues9, the fully connected layer classified abnormal areas
as benign or malignant, while various other studies classified
regions as benign, malignant or without tumour. Figure 1
depicts the structure of a classic CNN architecture.
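The layer sequence described above can be illustrated with a minimal sketch in plain Python. It is not the architecture of any cited study: the 6x6 patch, the hand-written edge kernel, the class weights and all function names are hypothetical, and a real CNN would learn its kernels and weights from annotated mammograms.

```python
import math

def conv2d(image, kernel):
    """Valid 2D convolution (cross-correlation, as used in most CNN libraries)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]

def relu(fmap):
    """Keep positive responses and set negative ones to zero."""
    return [[max(0.0, v) for v in row] for row in fmap]

def max_pool(fmap, size=2):
    """Non-overlapping max pooling; incomplete trailing windows are dropped."""
    return [
        [
            max(fmap[r + i][c + j] for i in range(size) for j in range(size))
            for c in range(0, len(fmap[0]) - size + 1, size)
        ]
        for r in range(0, len(fmap) - size + 1, size)
    ]

def fully_connected(features, weights, biases):
    """One weighted sum per output class."""
    return [sum(w * f for w, f in zip(row, features)) + b
            for row, b in zip(weights, biases)]

def softmax(scores):
    """Convert class scores into probabilities that sum to 1."""
    shift = max(scores)
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical 6x6 greyscale patch containing a bright vertical edge.
patch = [[0, 0, 0, 1, 1, 1] for _ in range(6)]
# Hand-written vertical-edge kernel (an assumption; trained CNNs learn these).
edge_kernel = [[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]]

feature_map = relu(conv2d(patch, edge_kernel))  # 4x4 feature map
pooled = max_pool(feature_map, size=2)          # 2x2 after pooling
flat = [v for row in pooled for v in row]       # 4 features

# Arbitrary illustrative weights for three classes: normal, benign, malignant.
weights = [[0.1, 0.1, 0.1, 0.1], [0.2, 0.0, 0.2, 0.0], [0.3, 0.3, 0.3, 0.3]]
biases = [0.0, 0.0, 0.0]
probs = softmax(fully_connected(flat, weights, biases))
```

Pooling reduces the 4x4 feature map to 2x2, so the fully connected layer needs only four weights per class instead of sixteen, which is the parameter reduction the text refers to.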
Transfer learning
Transfer learning refers to leveraging the learned features of
a pre-trained model as the foundation for training a model to
perform a new task. It takes advantage of the fact that neural
networks trained on large databases of images, such as
ImageNet, have learned and established parameters in
the early layers relevant to numerous visual tasks, regardless of the
specific task they are programmed to perform.10 Salehi and
colleagues explained that certain functions of CNNs in lower
layers, such as those dedicated to edge, texture and pattern
detection, can be calibrated and applied to higher layers of
the network.10 However, the specific features that must be
learned will increase in complexity where, for instance, the
output layer would only respond to images of a specific tumour
that it had been trained to detect. Thus, using a pre-trained
model and customising the new model with additional new
layers and adjustments to the number of neurons or classes
depending on the specific task requirements has the benefit of
minimising training time and requires limited data. This means
earlier models can be refined and adapted to various tasks,
including detecting and classifying lesions, without retraining
a deep neural network from scratch.
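The idea of reusing early layers and training only a new output layer can be sketched with a toy example. This is a simplified illustration under stated assumptions, not the procedure of any cited study: the two fixed edge kernels stand in for pre-trained layers, the new "head" is a single logistic unit, and the 5x5 stripe images and all names are hypothetical.

```python
import math

# Stand-in for pre-trained early layers: fixed edge detectors, never updated.
FROZEN_KERNELS = [
    [[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]],   # vertical edges
    [[-1, -1, -1], [0, 0, 0], [1, 1, 1]],   # horizontal edges
]

def frozen_features(image):
    """Apply each frozen 3x3 kernel; keep the mean absolute response per kernel."""
    feats = []
    for kernel in FROZEN_KERNELS:
        total, count = 0.0, 0
        for r in range(len(image) - 2):
            for c in range(len(image[0]) - 2):
                resp = sum(image[r + i][c + j] * kernel[i][j]
                           for i in range(3) for j in range(3))
                total += abs(resp)
                count += 1
        feats.append(total / count)
    return feats

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_head(samples, labels, epochs=500, lr=0.1):
    """Train only the new output layer (logistic regression) on frozen features."""
    feats = [frozen_features(img) for img in samples]  # backbone is fixed
    w, b = [0.0] * len(feats[0]), 0.0
    for _ in range(epochs):
        for x, y in zip(feats, labels):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err
    return w, b

def predict(image, w, b):
    """Probability of class 1 for a new image, using the frozen features."""
    x = frozen_features(image)
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

# Hypothetical toy task: label 1 = vertical stripe, label 0 = horizontal stripe.
vertical = [[0, 0, 1, 0, 0] for _ in range(5)]
horizontal = [[0] * 5, [0] * 5, [1] * 5, [0] * 5, [0] * 5]
w, b = train_head([vertical, horizontal], [1, 0])
```

Only three numbers (two weights and a bias) are trained here, which mirrors why transfer learning needs less data and time than training every layer from scratch.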
pooling layer, and fully connected layer
Note. The nal output is classied as normal, benign or
In medical imaging where the number of fully annotated
mammograms available is limited, training a deep learning
model with data augmentation can improve the
models while also minimising data overfitting. Overfitting is
a statistical error whereby the model fits too closely to the
training dataset and cannot be generalised to new data.11
Data augmentation enables artificial expansion of existing
datasets to generate modified copies and, hence, introduces
a vast variety of patterns that the model can recognise and
learn from. Improvement to data variability is demonstrated
to enhance the predictive accuracy of the AI models in
detecting suspicious regions of interest when presented with
normal and abnormal DMs.12,13 This provides the radiologist
with psychological support, by reducing the cognitive burden
associated with identifying potential lesion regions.
For example, generative adversarial network (GAN)-based augmentation, an unsupervised deep
learning method that extracts hidden properties from data to
formulate its decision-making process, has shown potential
to improve accuracy in mass classification after geometric
transformations from unrelated masses or increase in noise
distortions.12 As such, it has also been a widely used approach
in breast mass detection and mass segmentation.13 As the
use of data augmentation methods expands, it is pertinent to
evaluate the quality of the output and recognise that building
upon minimal databases can restrict the generalisation
ability of the model and potentially reinforce inherent biases.
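As a rough illustration of how a dataset can be expanded artificially, the sketch below generates modified copies of one image using flips, a rotation and added noise. It covers only simple geometric and noise-based augmentation, not the GAN-based approach mentioned above, and the 2x2 patch, noise level and function names are hypothetical.

```python
import random

def flip_horizontal(image):
    """Mirror the image left to right."""
    return [row[::-1] for row in image]

def flip_vertical(image):
    """Mirror the image top to bottom."""
    return [row[:] for row in image[::-1]]

def rotate_90(image):
    """Rotate the image 90 degrees clockwise."""
    return [list(col) for col in zip(*image[::-1])]

def add_noise(image, sigma, rng):
    """Add Gaussian noise and clip to the [0, 1] intensity range."""
    return [[min(1.0, max(0.0, v + rng.gauss(0.0, sigma))) for v in row]
            for row in image]

def augment(image, sigma=0.05, seed=0):
    """Return the original, three geometric variants and a noisy copy of each."""
    rng = random.Random(seed)  # seeded so the sketch is repeatable
    variants = [image, flip_horizontal(image), flip_vertical(image),
                rotate_90(image)]
    return variants + [add_noise(v, sigma, rng) for v in variants]

# Hypothetical 2x2 patch with intensities scaled to [0, 1].
patch = [[0.0, 0.2], [0.8, 1.0]]
augmented = augment(patch)
```

One annotated patch yields eight training samples here. Because every copy derives from the same source image, the caution in the text still applies: augmentation cannot add information that a small or biased dataset lacks.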
With higher resoluon DM images, convenonal computer-
aided diagnosis (CAD) models can disnguish between benign
and malignant lesions by assessing their greyscale levels,
homogeneity, gradient, paerns and shape.14 However,
because dense breast ssue appears white and has similar
shade and intensity values as tumorous regions containing
microcalcicaons, dense breast ssue, with relavely high
amounts of glandular ssue and brous connecve ssue,
can hide lesions and is prone to misdiagnosis and reporng of
false negaves. With AI screening, it can perform detecon of
potenally tumorous region and compare its intensity value
with other regions of the breast followed by segmentaon
of the tumour area surrounded with malignant ssues.15
This can reduce the lower sensivity from human perceptual
error, because it separates pixels of cancer region from normal
region. Geras et al. showed that the addion of the deep
learning method, which learns the intermediate and abstract
representaons of the data, can improve accuracy in lesion
classicaon in DMs, reaching similar sensivity to radiologists’
Given the large processing capacity of AI, its capability of
analysing and processing data from wide-ranging sources,
including medical images, laboratory test results and patient
history, enables identification of patterns and abnormalities
that may otherwise be missed by human experts. Missed
microcalcifications can be attributed to their small size or
concealment by overlying high amounts of fibrous and glandular
tissues.16 Therefore, implementing AI in mammography has the
potential to increase sensitivity in differentiating between the
microcalcifications and non-tumorous anatomic structures,
such as increased breast density. It employs image processing
techniques to spatially filter the DM and improve
signal-to-noise ratio, yielding higher sensitivity for detecting true
abnormalities.15 In a study by Kim et al., the classification
performance of AI-CAD demonstrated a higher accuracy value
of 0.938–0.970 compared to an accuracy value of 0.810–0.881
achieved by radiologists.17 Findings by Liu and colleagues
also reported that combining the deep learning model into
mammography attained similar diagnostic performance to
that of an experienced radiologist, and significantly surpassed
the performance of a junior radiologist (p=0.029; p<0.05).18
The improvement indicated promising results in reducing the
quantity of unnecessary biopsies performed, showing potential
for early detection and intervention of breast cancer.
Reduced workload
Numerous European countries have employed double reading
with arbitration, whereas the United States typically has
employed single reading with CAD.19 While standard double
reading has been shown to reduce recall rates, it is labour
intensive. A study by Dembrower and colleagues compared
the cancer detection rates and efficiency of varying methods
of interpretation: single reading by AI, double reading by two
radiologists, double reading by one radiologist and AI and triple
reading by two radiologists and AI.20 The findings suggested
that triple reading (95% CI 1.04–1.11)
outperformed double reading by one radiologist and AI or
by two radiologists (95% CI 1.00–1.09). Triple reading increased
recalls by 5% and consensus discussion by 50%, while double
reading by one radiologist and AI decreased recalls by 4%
with a reasonable number of consensus discussions. In triple
reading, the perception of the combined radiologists was
favoured over the perception of the AI, indicating that the
ability of AI in detecting cancer was under-estimated rather
than over-estimated, explaining the slightly higher recall
rates. Because the higher abnormal interpretation rate for AI
and one radiologist did not translate into an increased recall
rate, this combination would help reduce workload time, a
reduction that has been demonstrated to be nearly 40%.20,21 Replacement of the
second reader with AI would substantially reduce the time
radiologists spend reading mammograms. Another study
by Lång and colleagues determined that mammography
screening supported with AI yielded a similar cancer detection
rate to standard double reading, with the recall rate being 0.2%
higher at 2.2%, suggesting that the use of AI in mammography
can be considered.22
Standardisaon within a clinical seng can help improve
interoperability and vast exchange of health data and
informaon. This is pernent to improve performance of
the models in imaging acquision and processing, because
the quality of image acquision aects radiomic feature
calculaons, radiomics being the extensive image-based
phenotyping of abnormalies through extracon of diverse
feature values from medical images.23 Currently, insucient
standardizaon is evident in the collecon and storage of
unstructured data, as well as in the process of unifying data
that represents a single healthcare system.24 Substanal
informaon technology and systems resources is required
to implement this, and the feasibility remains under acve
One method proposes using patient-reported outcomes (PROs)
and validated questionnaires, as they are valuable survival
indicators that can benefit cancer care delivery, research
and clinical operations.25 Nonetheless, several limitations are
present. These include patient-level barriers such as disability,
challenges in reading and responding to the questionnaires or
with recalling their symptoms, clinical-level obstacles like lack
of staff training with interpreting and implementing PROs into
clinical practices, and service-level challenges like lack of PRO
data logging into electronic medical records within a hospital
Table 1
Strengths
Transfer learning:
- Minimises training time and requires limited data through modifying a pre-trained model, tailored to suit specific requirements
Data augmentation:
- Minimises data overfitting
- Improves generalisability, image recognition, segmentation accuracy and analysis
- Enhances predictive accuracy in tumour classification
Weaknesses
- Lack of standardisation limits interoperability
- Limitations in obtaining and implementing patient-reported outcomes
- Data reproducibility is subjected to data drift
- Lack of high-quality and multi-institutional datasets may reduce generalisability
- Different mechanisms performed at each CNN layer require varying levels of complexity during
Opportunities
Pixel-level image classification:
- Reduces false positives due to ability to discern between tumorous and non-tumorous regions
- Improves cancer detection rates
Improved patient value through predictive diagnosis:
- Increases sensitivity for detection of true abnormalities
- Improves tumour classification accuracy
- Potential to reduce quantity of unnecessary biopsies and medical costs.
Opportunities for healthcare professionals:
- Enhances reading efficiency by reducing number of tests requiring radiologist interpretation
- Reduces workload
Threats
Patient privacy:
- Breach of health and personal information
- Lack of transparency
Algorithmic biases:
- Biases in input training data can produce skewed results and exacerbate health care inequality
- Ethical concerns regarding the role of AI in clinical judgement
Role of human judgement:
- Medico-legal responsibility for healthcare providers if incorrect diagnosis is made
- Potential discordance between clinical practices and AI suggestions
- Impairment in clinical judgement from over-reliance on AI technology, resulting in potential patient injury
- Jeopardization of the learning process and clinical reasoning abilities of medical students or novice
Data Reproducibility
Data reproducibility is limited when transferred across
healthcare systems and global communities, but even within
the training environment, data drift over time for AI algorithms
and advanced clinical decision support systems (CDSS) can affect their performance. This is a
result of variations in distribution, formatting or quality of
data, flawed data transformation, absence of natural drift
when training the model or covariate shift.27 Thus, because
AI algorithms evolve, standards must be incorporated to
continuously monitor them and ensure their validity, even
after AI has been successfully implemented as a technological
practice in medicine.
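One possible way to monitor a model input for drift is the population stability index (PSI), which compares the distribution of a feature at training time with its distribution in current data. The source does not prescribe this method; the sketch below is an assumption-laden illustration with hypothetical data, and the 0.25 alert level is a commonly quoted rule of thumb rather than a clinical standard.

```python
import math

def psi(reference, current, bins=10, eps=1e-6):
    """Population stability index between two samples of one feature.

    Bin edges come from the reference sample; current values outside that
    range are counted in the first or last bin.
    """
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / bins or 1.0  # avoid zero width for constant data

    def proportions(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(bins - 1, max(0, idx))] += 1
        # eps keeps empty bins from causing log(0) or division by zero
        return [max(c / len(values), eps) for c in counts]

    ref_p, cur_p = proportions(reference), proportions(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref_p, cur_p))

# Hypothetical mean pixel intensities at training time and after a
# scanner change that brightens every image by 0.3.
training_intensity = [i / 100 for i in range(100)]
current_intensity = [v + 0.3 for v in training_intensity]

drift_score = psi(training_intensity, current_intensity)
needs_review = drift_score > 0.25  # assumed rule-of-thumb alert level
```

A score near zero suggests the input distribution is unchanged, while a large score signals that the model may be seeing data unlike its training set and should be revalidated.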
Threats regarding Ethical Challenges
Precision medical technology relies on extensive medical
information for cancer diagnosis, screening, data processing,
optimising care delivery and conducting clinical operations.
To train models effectively, medical researchers need access
to patients’ personal health records. However, concerns
arise regarding the potential misuse of data, leading to issues
like identity theft, insurance fraud and illegal acquisition of
prescription drugs. To ensure ethical use of patient data in
clinical practice, medical researchers must be transparent about
how data will be used. Additionally, they should implement
robust safety measures to safeguard patient privacy and obtain
informed consent from individuals contributing their data.
Algorithmic Biases
