CerviNet: A Novel Approach for Pap-smear Images
Analysis for Cervical Cancer Classification
Khowaja Ashfaque1, Zou Beiji1 and Xiaoyan Kui1
1 Central South University, Changsha, 336017, Hunan, China
214718003@csu.edu.cn
Abstract. Cervical cancer is the fourth most common cancer among women
worldwide, and Pap smear images are used as a major diagnostic technique to
detect precancerous and cancerous abnormalities in the cervix, vagina, and vulva.
To overcome the difficulties of manual assessment, deep learning algorithms
have gained popularity in the development of automated computer-aided
diagnostic systems. This paper introduces CerviNet, a hybrid deep learning
approach for effective and accurate classification of cervical cells. The
training pipeline uses resampling and augmentation with random horizontal
flips and rotations, which improve the model's robustness to variations in the
input data. We use the Vision Transformer's (ViT) linear projection and position
embedding to convert the input images into patches that can be passed to a
transformer encoder. A fusion architecture is established by adding
supplementary convolutional layers, followed by a fully connected layer, to
refine the features extracted by the model. The ViT-based model is initialized
with pre-trained weights and fine-tuned for cervical cancer classification. To
enhance the quality of the cell images, we employ median smoothing and Gaussian
filtering as pre-processing techniques. The experimental results demonstrate
the potential of the proposed methodology for improving the accuracy of cervical
cancer classification. Notably, our model attained accuracies of 98.07% on the
2-state classification of the Herlev dataset and 98.08% on the 3-state
classification of the SIPaKMeD dataset. These dataset-specific accuracy rates
indicate the model's robustness and promise for practical clinical use.
Keywords: Cervical Cancer, Pap Smear Images, Classification, Vision Transformer.
1 Introduction
Cervical cancer is becoming more common around the world, especially in low-income
areas of developing countries, even though it is highly treatable [1]. It is the third most
common cause of cancer mortality and the fourth most common kind of cancer in women [2]. Annually,
about 650,000 women across the globe receive a diagnosis of cervical cancer [3]. Based
on Global Cancer Statistics (GCS), the disease claims the lives of 350,000 individuals
each year [4]. In developing nations with minimal resources, cervical cancer constitutes
roughly 15% of malignancies in women, while in wealthy countries it accounts for
around 4% [5]. More than 85% of these deaths occur in developing nations,
typically among women aged 40 to 60 years [6].
The most effective ways to prevent cervical cancer are HPV vaccination and regular
Pap tests. Cervical cancer is defined by the proliferation of malignant cells in the
tissues of the cervix, with human papillomavirus (HPV) infection being the predominant
risk factor. The Pap test is one method used to identify and diagnose cervical cancer [7].
In this test, a cotton swab and a brush are used to collect cells from the cervix and
vaginal area, which are then examined under a microscope for abnormalities; the
procedure is also known as a Pap smear [8]. Cervical cancer is difficult to identify
because women exhibit no symptoms in the early stages of the disease. Screening for
precancerous lesions before they develop into visible growths is a well-established
approach to preventing cervical cancer.
Pap-smear cell images can be used to assess the shape of malignant cells and deter-
mine the kind of cervical cancer. Cell nuclear characteristics and cell cytoplasm are the
most important variables in distinguishing cervical cancer types based on test images.
Manually diagnosing the malignancy type from these samples is a time-consuming and
demanding procedure for cytotechnologists.
Recent progress in Artificial Intelligence (AI) research has led to Machine Learning
(ML) models that can automatically identify and classify different forms of cancer.
Deep learning, a subset of machine learning, has demonstrated remarkable efficacy in
medical imaging. Intelligent automated systems based on medical imaging are essential
for evaluating cervical cells for indications of malignancy [9]. New technologies have
also made data processing more efficient in terms of cost and time. Cervicography, Pap
smears, and colposcopy are becoming more popular than conventional methods [10].
Although these tests are more objective than physical inspection, they are regarded as
instruments that support experts rather than replace their interpretation [11]. Medical
advancements have facilitated the detection of cervical cancer by using imaging
features, such as the shape and colour of the cytoplasm and nucleus, to predict
abnormalities in cell patterns. Among the various testing methods, Pap testing has
proven to be a cost-effective way of identifying cells that could potentially develop
into cancer [12]. The collected smears are assessed by two experts in order to minimise
the possibility of false-negative results. Nevertheless, prolonged manual examination
of smears by medical professionals can lead to diagnostic errors caused by cognitive
and physical exhaustion [13]. Moreover, it requires a substantial level of technical
expertise, which in turn raises the cost of inspections.
The aim of this study is to address the increasing need for advanced tools in the
detection of cervical cancer through the use of a state-of-the-art deep learning model.
Our technique is based on combining the Vision Transformer (ViT) and Convolutional
Neural Network (CNN) layers. ViT exhibits remarkable expertise in capturing intricate
relationships within images, while CNNs are well known for their ability to extract
spatial hierarchies. Combining the two exploits their individual strengths, resulting
in a classification framework that is both robust and precise. Our work also applies
preprocessing techniques to improve the quality of Pap smear images: we use median
smoothing and Gaussian filtering to reduce noise and enhance the overall clarity of
cellular structures. Using a pre-trained ViT model enables efficient fine-tuning to
accommodate the complexities found in collections of cervical cancer images. The block
diagram of CerviNet is shown in Fig. 1.
The primary contributions of this research are as follows:
- We present a model that combines the strengths of the Vision Transformer (ViT)
  and Convolutional Neural Network (CNN) architectures to achieve accurate 2-state
  and 3-state classification of cervical cancer on two datasets.
- We add CNN layers that extract local features to complement the global
  understanding of ViT, improving the overall performance and diagnostic accuracy
  of the model.
- We improve the quality of Pap smear images through data preprocessing, notably
  median smoothing and Gaussian filtering.
- We conduct a comprehensive assessment of the proposed model using multiple
  performance evaluation criteria, including accuracy, precision, sensitivity,
  F1-score, Area Under the Curve (AUC), and the confusion matrix.
The following describes the way the manuscript is organized: Section 2 discusses
related work. Section 3 covers the data description, pre-processing techniques, and our
proposed work. Section 4 includes an analysis of results, system requirements, and per-
formance assessment. Finally, Section 5 describes conclusions and future work.
Fig.1 Block diagram of overall framework of our study.
2 Related Work
Several recent studies have been undertaken on the diagnosis of cervical cancer. Med-
ical professionals and scientists globally are actively involved in continuous research
to enhance their understanding of cervical cancer, with a specific emphasis on preven-
tion, remedies, and efficient treatments for those who have been diagnosed with this
ailment. With the ongoing progress of artificial intelligence (AI) and image processing
technology, we are investigating advanced diagnostic approaches used in medical im-
age processing, specifically in relation to cervical cancer.
Chen et al. [14] developed a lightweight CNN (LCNN) using knowledge distillation to
boost performance. They employed VGG, ResNet, and Inception-ResNetV2 as teacher
models and Xception, MobileNet, MobileNetV2, and DenseNet as student models.
Inception-ResNetV2 yielded 73.58% accuracy but faced hyperparameter tuning challenges.
Manna et al. [15] introduced an ensemble CNN model (CNN-F) using fuzzy rank-based
fusion for cervical image classification. The model fused pre-trained InceptionV3,
Xception, and DenseNet169 and outperformed other models in accuracy. However, it faced
challenges in classifying overlapping and poor-contrast images. Pal et al. [16]
introduced deep multiple instance learning (DMIL) for cell classification, using a deep
CNN to extract features and obtaining an average accuracy of 84.55%. Ghoneim et al. [17]
devised a technique for classifying cervical cancer cells in which a CNN performs
feature extraction and an extreme learning machine (ELM) categorizes the input images
into normal and abnormal classes. The experiments relied on the Herlev database. The
CNN-ELM-based system achieved a classification accuracy of 91.2% for 7-class
classification.
Mesquita et al. [18] introduced a cervical cancer detection system that uses textural
information obtained through a randomised neural network signature (RNNS). On the
Herlev dataset, it achieved a classification accuracy of approximately 87.75%. Its
disadvantages are the large number of input parameters and its sensitivity to rotation.
Win et al. [19] classified cervical cancer using an ensemble classifier that combined
SVM, linear discriminant, bagged trees, k-nearest neighbours, and boosted trees. On the
SIPaKMeD dataset, it achieved an accuracy of 98.27% for 2-class classification and
94.09% for 5-class classification. Deo et al. [20] emphasise the value of Pap smear
images as the primary diagnostic technique and propose CerviFormer, a transformer-based
model for cervical cancer classification. On publicly available datasets, CerviFormer
achieves an accuracy of 93.70% for 3-state classification on the SIPaKMeD dataset and
94.57% for 2-state classification on the Herlev dataset.
Existing models struggle to achieve optimal outcomes due to concerns such as over-
fitting, parameter adjustments, and gradient vanishing. Furthermore, existing models
require significant processing effort when trained over many epochs. As a result, there
is an urgent need for the development of a more efficient model that can save time and
computational resources by training on fewer epochs.
3 Materials and Method
This section describes the datasets, their pre-processing steps, and the proposed
model.
3.1 Datasets
We trained and evaluated the model on two publicly available Pap smear datasets.
SIPaKMeD Dataset. The SIPaKMeD dataset contains 4049 distinctive images of cer-
vical cells [21]. Images are classified into five groups: parabasal, superficial intermedi-
ate, dyskeratotic, koilocytotic, and metaplastic. The five categories can be classified as
normal, abnormal, or benign. The normal category includes parabasal and superficial
intermediate, with 1618 images. The abnormal class consists of dyskeratotic and koil-
ocytotic categories, with 1638 images. The benign group includes metaplastic with 793
images. A few samples from the SIPaKMeD dataset are shown in Fig. 2.
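The grouping of the five SIPaKMeD cell types into three states can be written down as a small lookup table. The following is a minimal sketch using only the class names and image counts quoted above; the variable names are illustrative and not taken from the authors' code.

```python
# Mapping of the five SIPaKMeD cell types to the three states described in the text.
SIPAKMED_GROUPS = {
    "normal": ("parabasal", "superficial intermediate"),
    "abnormal": ("dyskeratotic", "koilocytotic"),
    "benign": ("metaplastic",),
}

# Image counts per state, as quoted in the text.
GROUP_COUNTS = {"normal": 1618, "abnormal": 1638, "benign": 793}

# Reverse lookup: cell type -> state label.
CELL_TO_STATE = {
    cell: state for state, cells in SIPAKMED_GROUPS.items() for cell in cells
}

total_images = sum(GROUP_COUNTS.values())
# Share of each state in percent, useful when deciding whether to resample.
state_share = {s: 100.0 * n / total_images for s, n in GROUP_COUNTS.items()}
```

The three counts add up to the 4049 images reported for the dataset, and the benign state is roughly half the size of the other two, which is one reason a resampling step can be useful.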
Herlev Dataset. The Herlev dataset [22] contains 917 cell images categorized into seven
groups: superficial squamous, intermediate squamous, columnar, mild dysplasia, moderate
dysplasia, severe dysplasia, and carcinoma in situ. These seven groups can also be
classified as normal or abnormal: 74% of the images are abnormal and the remaining 26%
are normal. Samples are shown in Fig. 3.
Fig. 2 Samples of the SIPaKMeD dataset: Parabasal and Superficial Intermediate (Normal); Dyskeratotic and Koilocytotic (Abnormal); Metaplastic (Benign).
Fig. 3 Samples of the Herlev dataset: Normal Intermediate and Normal Columnar (Normal); Carcinoma in Situ, Light Dysplastic, Moderate Dysplasia, and Severe Dysplasia (Abnormal).
3.2 Data Pre-processing
In the pre-processing stage, we handle the variety of image formats and dimensions
present in the datasets. The images are initially in BMP format and come in different
sizes. We resize them to a uniform size of 224x224 pixels and convert them to the
commonly used JPG format in order to standardize the input for our model and ensure
consistent performance. We then apply median smoothing and Gaussian filtering to
improve the image quality and help the model better interpret visual information.
Gaussian filtering lowers noise, while median smoothing removes isolated outlier pixels
and so preserves the cellular structures already present in the images. Together, these
pre-processing stages improve the interpretability and accuracy of the model in later
classification tasks.
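The abstract and introduction name median smoothing and Gaussian filtering as the pre-processing filters. The sketch below illustrates both on a tiny greyscale image stored as a list of rows. The 3x3 window, `sigma = 1.0`, and the clamped border handling are assumptions made for illustration; the paper does not state the kernel sizes or the library used.

```python
import math
from statistics import median


def median_filter(img, k=3):
    """Median-smooth a 2D list of grey levels with a k x k window (borders clamped)."""
    h, w, r = len(img), len(img[0]), k // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [
                img[min(max(y + dy, 0), h - 1)][min(max(x + dx, 0), w - 1)]
                for dy in range(-r, r + 1)
                for dx in range(-r, r + 1)
            ]
            out[y][x] = median(window)
    return out


def gaussian_kernel(k=3, sigma=1.0):
    """Return a normalised k x k Gaussian kernel (weights sum to 1)."""
    r = k // 2
    g = [
        [math.exp(-(dx * dx + dy * dy) / (2.0 * sigma * sigma)) for dx in range(-r, r + 1)]
        for dy in range(-r, r + 1)
    ]
    total = sum(sum(row) for row in g)
    return [[v / total for v in row] for row in g]


def gaussian_filter(img, k=3, sigma=1.0):
    """Convolve a 2D list with a Gaussian kernel (borders clamped)."""
    h, w, r = len(img), len(img[0]), k // 2
    kern = gaussian_kernel(k, sigma)
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for dy in range(-r, r + 1):
                for dx in range(-r, r + 1):
                    yy = min(max(y + dy, 0), h - 1)
                    xx = min(max(x + dx, 0), w - 1)
                    acc += kern[dy + r][dx + r] * img[yy][xx]
            out[y][x] = acc
    return out


# Toy example: a flat 5x5 image with one impulse-noise pixel in the centre.
noisy = [[10] * 5 for _ in range(5)]
noisy[2][2] = 255
denoised = median_filter(noisy)      # the outlier is replaced by the local median
smoothed = gaussian_filter(denoised)  # a flat image stays flat after smoothing
```

In practice an image library would perform these operations far more efficiently; the pure-Python loops are only meant to make the two filters explicit.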
3.3 Overview of Proposed CerviNet Model
To increase the effectiveness and accuracy of cervical cancer classification in diagnosis,
we have proposed a novel fusion model, the CerviNet model, utilizing cutting-edge
deep learning techniques. Our method is predicated on a meticulously developed deep
learning architecture that makes use of current developments in the area. Fig.4 shows
the overall structure of the proposed model.
Fig. 4 Architecture of the proposed CerviNet model.
Vision Transformer (ViT) Encoding. The Vision Transformer (ViT), a transformer-
based architecture renowned for its capacity to extract long-range correlations in im-
ages, is the central component of our methodology. During the Vision Transformer
(ViT) encoding step, images are separated into smaller patches, which improves pro-
cessing efficiency. By moving patches to a lower-dimensional embedding space, linear
projection lowers computing complexity. This may be stated mathematically as:
$$z_i = W x_i + b \qquad (1)$$
where $z_i$ represents the projected vector, $x_i$ is the input patch vector, $W$ is the
weight matrix, and $b$ is the bias vector.
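The patch splitting and linear projection described above can be sketched on a toy image. The helper names, the 4x4 image, and the weight values are hypothetical. The 16-pixel patch size in the last line is an assumption inferred from the 14 x 14 output grid in Table 1 for 224x224 inputs; it is not stated explicitly in the text.

```python
def patchify(img, p):
    """Split an H x W image (list of rows) into flattened p x p patches, row-major."""
    h, w = len(img), len(img[0])
    return [
        [img[y + dy][x + dx] for dy in range(p) for dx in range(p)]
        for y in range(0, h, p)
        for x in range(0, w, p)
    ]


def linear_project(x, W, b):
    """Compute z = W x + b for one flattened patch x."""
    return [sum(wij * xj for wij, xj in zip(row, x)) + bi for row, bi in zip(W, b)]


# Toy 4x4 image with pixel values 0..15, split into four 2x2 patches.
img = [[r * 4 + c for c in range(4)] for r in range(4)]
patches = patchify(img, 2)

# Hypothetical 2 x 4 projection: keeps the first and last pixel of each patch.
W = [[1, 0, 0, 0], [0, 0, 0, 1]]
b = [0.5, -0.5]
tokens = [linear_project(x, W, b) for x in patches]

# Assumed ViT setting: 224x224 input with 16x16 patches gives a 14 x 14 grid.
num_patches_224 = (224 // 16) ** 2
```

With these assumptions the 224x224 input yields 196 patches, which together with one class token matches the sequence length of 197 listed in Table 1.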
Positional Embeddings. Positional embeddings are then used to preserve spatial
relationships. Each patch is assigned a positional embedding vector, which is added to
the projected vector. Mathematically:
$$z_i' = z_i + PE(i) \qquad (2)$$
where $z_i$ is the projected vector of patch $i$ and $PE(i)$ represents the positional
embedding for position $i$.
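As a small illustration of this step, the sketch below adds one embedding vector per position to the corresponding projected patch vector. The function name and all numeric values are hypothetical; in a trained ViT the position table is learned.

```python
def add_positional(tokens, pos_table):
    """Add the embedding for position i to token i, element-wise."""
    return [[t + p for t, p in zip(tok, pos)] for tok, pos in zip(tokens, pos_table)]


# Two projected patch vectors and a hypothetical table of position embeddings.
projected = [[0.5, 4.5], [2.5, 6.5]]
pos_table = [[0.0, 0.1], [0.2, 0.3]]
embedded = add_positional(projected, pos_table)
```

Because the same patch content receives a different offset at each position, the encoder can tell patches apart by location even though self-attention itself is order-agnostic.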
Transformer Encoder with Self-Attention. The transformer encoder in our model
utilizes self-attention techniques, enabling it to capture long-range dependencies and
contextual interactions across patches for global feature interpretation. The attention
mechanism calculates a weighted sum of values, where the weights are determined by
the similarity between the query and key vectors. This can be mathematically expressed
as:
$$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\left(\frac{Q K^{T}}{\sqrt{d_k}}\right) V \qquad (3)$$
where $Q$, $K$, and $V$ are the query, key, and value matrices, respectively, and $d_k$
is the dimension of the key vectors.
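Scaled dot-product attention can be sketched in a few lines of plain Python. This is a single-head illustration on toy matrices, not the multi-head implementation inside the pre-trained ViT; the function names are illustrative.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    s = sum(exps)
    return [e / s for e in exps]


def attention(Q, K, V):
    """Return softmax(Q K^T / sqrt(d_k)) V for lists of row vectors."""
    d_k = len(K[0])
    out = []
    for q in Q:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in K]
        weights = softmax(scores)
        # Weighted sum of the value vectors.
        out.append([sum(w * v[j] for w, v in zip(weights, V)) for j in range(len(V[0]))])
    return out


# Two identical keys receive equal weight, so the output is the mean of the values.
Q = [[1.0, 0.0]]
K = [[1.0, 1.0], [1.0, 1.0]]
V = [[0.0, 0.0], [2.0, 4.0]]
attended = attention(Q, K, V)
```

Each output row is a convex combination of the value vectors, which is why every patch can draw on information from every other patch.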
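Integration of Convolutional Layers. Convolutional layers are introduced to supplement
ViT's global comprehension by extracting local features. The mathematical
representation of this convolution operation is as follows:
$$F_{\mathrm{conv}} = \mathrm{Conv}(X) \qquad (4)$$
where $X$ is the input to the convolutional layer and $F_{\mathrm{conv}}$ is the
resulting local feature map.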
Fusion Architecture. The fusion architecture concatenates features obtained from ViT
and convolutional layers. The mathematical representation of the concatenation process
is as follows:
$$F_{\mathrm{fused}} = [F_{\mathrm{ViT}},\, F_{\mathrm{conv}}] \qquad (5)$$
where $F_{\mathrm{fused}}$ represents the fused features, and $F_{\mathrm{ViT}}$ and
$F_{\mathrm{conv}}$ are the feature vectors extracted from ViT and convolutional layers,
respectively.
Utilization of Fully Connected Layers. Additional fully connected layers are
subsequently used to learn complicated nonlinear combinations of the fused features.
The output of these layers can be expressed as follows:
$$y = \sigma(W_{fc} F_{\mathrm{fused}} + b_{fc}) \qquad (6)$$
where $y$ represents the final output, $F_{\mathrm{fused}}$ is the fused feature vector,
$W_{fc}$ is the weight matrix, $b_{fc}$ is the bias vector, and $\sigma$ is the
activation function.
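The concatenation and fully connected steps can be sketched together as follows. All feature values, weights, and dimensions are hypothetical, and softmax is assumed as the output activation for a 3-state classifier; the text does not name the activation.

```python
import math


def fuse(f_vit, f_conv):
    """Concatenate the ViT and convolutional feature vectors."""
    return list(f_vit) + list(f_conv)


def fully_connected(x, W, b):
    """Compute W x + b for one feature vector x."""
    return [sum(w * v for w, v in zip(row, x)) + bi for row, bi in zip(W, b)]


def softmax(z):
    """Numerically stable softmax (assumed output activation)."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    s = sum(exps)
    return [e / s for e in exps]


# Hypothetical feature vectors from the two branches.
f_vit = [0.2, -0.1, 0.4]
f_conv = [1.0, 0.5]
fused = fuse(f_vit, f_conv)

# Hypothetical 3 x 5 weight matrix and bias for a 3-state output.
W_fc = [
    [0.1, 0.0, 0.2, 0.3, -0.1],
    [0.0, 0.5, -0.2, 0.1, 0.4],
    [-0.3, 0.2, 0.1, 0.0, 0.2],
]
b_fc = [0.0, 0.1, -0.1]
probs = softmax(fully_connected(fused, W_fc, b_fc))
predicted_state = max(range(len(probs)), key=probs.__getitem__)
```

The fused vector simply has the combined length of its two inputs, so the first dimension of the fully connected weight matrix must match that sum.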
The CerviNet architecture complements ViT's attention mechanism with the spatial
hierarchies captured by the convolutional layers, resulting in a comprehensive
representation of cervical cell images.
Our model uses pretrained weights for the ViT backbone, allowing quick adaptation to
the subtleties of cervical cancer classification tasks. Fine-tuning aligns the model
with the specific features of cervical cell images, which improves diagnostic accuracy.
Table 1 outlines the key architectural elements, parameters, and settings employed in
the model.
Table 1. Model layers overview and parameters used in our model.
| Model Type | Layer (type) | Output shape | Parameters | Trainable |
|---|---|---|---|---|
| ViT | Conv2d | [32, 768, 14, 14] | 590,592 | False |
| ViT | Encoder | [32, 197, 768] | 151,296 | False |
| ViT | Dropout | [32, 197, 768] | --- | --- |
| ViT | Sequential | [32, 197, 768] | 85,054,464 | False |
| ViT | LayerNorm | [32, 197, 768] | 1,536 | False |
| ViT | Linear (head) | [32, 1000] | 769,000 | False |
| CNN | Conv2d (Conv Layer) | [32, 64, 1, 1] | 576,064 | True |
| CNN | Linear (FC Layer) | [32, 3] or [32, 2] | 455 | True |
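Most of the parameter counts in Table 1 can be reproduced with simple arithmetic. The sketch below does so under stated assumptions that are inferred from the counts and not given in the text: a ViT-B/16-style backbone (3 input channels, 16x16 patch kernel, 12 encoder blocks, MLP width 3072) and a 3x3 convolution with 1000 input channels for the CNN branch.

```python
def conv2d_params(c_in, c_out, k):
    """Weights plus biases of a k x k convolution."""
    return c_in * c_out * k * k + c_out


def linear_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out


D = 768  # embedding width, from the output shapes in Table 1

patch_embed = conv2d_params(3, D, 16)   # assumed 16x16 patch kernel on RGB input
pos_embed = 197 * D                     # one vector per token position
layer_norm = 2 * D                      # scale and shift
head = linear_params(D, 1000)           # 1000-way pre-training head

# One encoder block under the assumed ViT-B layout.
block = (
    linear_params(D, 3 * D)     # query, key, value projections
    + linear_params(D, D)       # attention output projection
    + linear_params(D, 3072)    # MLP expansion (assumed width 3072)
    + linear_params(3072, D)    # MLP contraction
    + 2 * layer_norm            # two layer norms per block
)
encoder_blocks = 12 * block     # assumed 12 blocks

cnn_conv = conv2d_params(1000, 64, 3)   # assumed 3x3 kernel on 1000 channels
```

Under these assumptions every computed value equals the corresponding table entry, which suggests the convolutional branch operates on the 1000-dimensional output of the ViT head. The 455 parameters listed for the final linear layer equal `linear_params(64, 7)`.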
4 Results and Discussion
The experimental hardware setup includes an Intel Iris Xe with 8GB memory and a
12th Generation 2.3 GHz Intel Core i7-12700H processor equipped with 16GB RAM.
For the software platform, we use Windows 11 (64-bit) as the operating system. Train-
ing is conducted on Google Colab Pro [23], a proficient cloud-based environment that
leverages a GPU for efficient machine learning and deep learning tasks. The GPU de-
vice utilized for training was an NVIDIA Tesla T4 GPU. The experiment utilizes the
PyTorch library, and the development environment is seamlessly facilitated by Python
version 3.10.12.
4.1 Comparative Analysis
On both the Herlev and SIPaKMeD datasets, our model performs well in training. After
50 epochs of training, it achieves training accuracies of 98.07% on the Herlev dataset
and 98.08% on the SIPaKMeD dataset. Training takes about 18 minutes and 57 seconds for
the Herlev dataset and 50 minutes and 40 seconds for the SIPaKMeD dataset.
Fig. 5 shows the confusion matrices, which break down the model's classification
performance for both datasets by class and provide insight into its strengths and areas
for improvement. Table 2 shows the comparative analysis of the model on both datasets
for the training, validation, and testing sets, where (A) denotes accuracy, (P)
precision, (R) recall, and (F) F1-score. The evaluation metrics used for this model are
shown in Table 3.
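Table 2. Comparative analysis of model in terms of training, validation and testing sets (%).
| Dataset | Train A | Train P | Train R | Train F | Val A | Val P | Val R | Val F | Test A | Test P | Test R | Test F |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Herlev | 98.07 | 98.63 | 98.00 | 98.31 | 93.18 | 93.57 | 93.22 | 93.29 | 97.29 | 96.83 | 92.42 | 94.57 |
| SIPaKMeD | 98.08 | 98.57 | 98.71 | 98.64 | 92.53 | 93.45 | 92.57 | 92.53 | 91.13 | 91.51 | 91.13 | 91.17 |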
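Table 3. Evaluation Metrics
| Assessment | Formula |
|---|---|
| Accuracy | $\frac{TP + TN}{TP + TN + FP + FN}$ |
| Precision | $\frac{TP}{TP + FP}$ |
| Recall | $\frac{TP}{TP + FN}$ |
| F1-Score | $\frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}$ |
Where TP is true positive, TN is true negative, and FP and FN are false positive and
false negative, respectively.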
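The four metrics follow directly from the confusion-matrix counts. The sketch below uses their standard definitions; the function names and the example counts are illustrative and not taken from the paper's experiments.

```python
def classification_metrics(tp, tn, fp, fn):
    """Return accuracy, precision, recall and F1 from binary confusion counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def f1_from_pr(precision, recall):
    """F1-score as the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Illustrative counts only.
example = classification_metrics(tp=8, tn=6, fp=2, fn=4)

# Cross-check against the Herlev testing row of Table 2 (P = 96.83, R = 92.42).
herlev_test_f1 = round(f1_from_pr(96.83, 92.42), 2)
```

The cross-check reproduces the F1-score of 94.57 reported for the Herlev testing set, since the harmonic mean is scale-invariant and works equally on percentages.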
Table 5 compares our proposed model with recent state-of-the-art models in terms of
accuracy, which allows us to evaluate its performance against known benchmarks in the
field of cervical cancer classification. We compute the class-wise precision, recall,
and F1-score for both datasets on the test set, as shown in Table 4. Fig. 6 shows the
Area Under the Curve (AUC) of the Receiver Operating Characteristic (ROC) curve for
both datasets, which demonstrates the model's effectiveness across individual classes.
The ROC curve is a useful tool for assessing a classifier's ability to distinguish
between positive and negative examples at various threshold levels. Together, these
metrics and visuals give a clearer picture of the proposed model's discriminative
capability and overall effectiveness relative to previous methodologies.
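For a binary problem, the area under the ROC curve equals the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one. The sketch below computes the AUC this way; the function name, labels, and scores are illustrative and unrelated to the paper's results.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Illustrative labels (1 = abnormal) and classifier scores.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(labels, scores)
```

For a multi-class task such as the 3-state SIPaKMeD setting, one common choice is to compute this quantity once per class in a one-vs-rest fashion.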
Fig. 5 Confusion matrices for the Herlev dataset and the SIPaKMeD dataset.
Table 4. Class-wise evaluation metrics of both datasets.
| Dataset | Class | Precision | Recall | F1-score |
|---|---|---|---|---|
| Herlev | Abnormal | 0.9635 | 0.9635 | 0.9597 |
| Herlev | Normal | 0.8939 | 0.8939 | 0.8873 |
| SIPaKMeD | Abnormal | 0.9600 | 0.8780 | 0.9989 |
| SIPaKMeD | Benign | 0.7926 | 0.8629 | 0.8228 |
| SIPaKMeD | Normal | 0.9398 | 0.9590 | 0.9479 |
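Table 5. Comparison of current study with other state-of-the-art models
| Study | Dataset | Method | Accuracy |
|---|---|---|---|
| Maurya et al. [24] | SIPaKMeD | CNN-LSTM | 95.80% |
| Maurya et al. [24] | SIPaKMeD | ViT-CNN | 97.6% |
| Deo et al. [20] | Herlev | Cross attention and latent transformer | 94.57% |
| Deo et al. [20] | SIPaKMeD | Cross attention and latent transformer | 93.70% |
| Proposed CerviNet | Herlev | ViT and CNN fusion model | 98.07% |
| Proposed CerviNet | SIPaKMeD | ViT and CNN fusion model | 98.08% |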
Fig. 6 AUC for the Herlev dataset and the SIPaKMeD dataset.
4.2 Ablation Study
In our ablation study, we tested how different components affect the performance of
our ViT-based model for cervical cancer classification. First, we trained a ViT model
without CNN layers, which achieved an accuracy of 93.21%. The training accuracy and
loss graphs of this ViT-only model, shown at the end of this section, provide insight
into its learning behaviour. Furthermore, we compared the model with and without data
pre-processing to determine its impact on classification results. These comparisons are
summarized in Table 6 and demonstrate the efficacy of our data pre-processing
strategies in enhancing model performance. We also examined how pre-trained weights
affect the performance of the ViT model by comparing models trained with pre-trained weights
to those trained from scratch, as shown in Table 7. These ablation experiments provide
valuable insights about the contributions of many components to the overall efficacy of
our cervical cancer classification model.
Table 6. Comparative analysis of our model with and without pre-processing
| Dataset | With Pre-processing | Without Pre-processing |
|---|---|---|
| Herlev | 98.07% | 96.65% |
| SIPaKMeD | 98.08% | 95.14% |
Table 7. Comparative analysis of our model with and without ViT pre-trained weights
| Dataset | ViT Pre-trained Weights | Without ViT Pre-trained Weights |
|---|---|---|
| Herlev | 98.07% | 93.81% |
| SIPaKMeD | 98.08% | 94.23% |
Fig. 7 Training accuracy and loss graphs of ViT without CNN layers.
5 Conclusion
In conclusion, our study addresses the critical need for accurate and efficient
classification of cervical cancer, a common disease affecting women globally. We
presented a hybrid strategy that uses deep learning to improve cervical cell
classification from Pap smear images. Our methodology includes resampling,
augmentation, and pre-processing, which improve the model's robustness and feature
extraction capabilities. By combining the Vision Transformer (ViT) and convolutional
layers in a fusion architecture, we achieved high accuracy in cervical cancer image
classification across two datasets. The experimental findings demonstrate the efficacy
of the proposed methodology, with our model reaching accuracies of 98.07% on the
2-state classification of the Herlev dataset and 98.08% on the 3-state classification
of the SIPaKMeD dataset. These findings illustrate the model's robustness and its
potential for practical clinical application. By consistently identifying cervical cancer
images, our approach has major implications for early detection and therapy, potentially
improving patient outcomes and lowering mortality rates.
In the future, we plan to expand the training of the proposed model using a colposcopy
dataset to increase its usefulness in classifying cervical cancer. In addition, our objec-
tive is to investigate the adaptability of our hybrid methodology for tackling a wide
range of medical and non-medical tasks, thereby making a valuable contribution to the
advancement of automated diagnostic systems. This study represents a notable ad-
vancement in the classification of cervical cancer and establishes the foundation for
further applications in the fields of medical imaging and other related areas.
Acknowledgments. The work was supported by the National Key R&D Program of China
(No.2018AAA0102100); the National Natural Science Foundation of China (No. U22A2034,
62177047); the Key Research and Development Program of Hunan Province (No.2022SK2054);
Central South University Research Programme of Advanced Interdisciplinary Studies
(No.2023QYJC020).
Disclosure of Interests. The authors declare that they have no conflicts of interest.
References
1. Ortiz, A.P., et al., Elimination of cervical cancer in US Hispanic populations: Puerto
Rico as a case study. Preventive Medicine, 2021. 144: p. 106336.
2. Arbyn, M., et al., Estimates of incidence and mortality of cervical cancer in 2018: a
worldwide analysis. The Lancet Global Health, 2020. 8(2): p. e191-e203.
3. Ferlay, J., et al., Estimating the global cancer incidence and mortality in 2018:
GLOBOCAN sources and methods. International journal of cancer, 2019. 144(8): p.
1941-1953.
4. Sung, H., et al., Global cancer statistics 2020: GLOBOCAN estimates of incidence and
mortality worldwide for 36 cancers in 185 countries. CA: a cancer journal for
clinicians, 2021. 71(3): p. 209-249.
5. Fontham, E.T., et al., Cervical cancer screening for individuals at average risk: 2020
guideline update from the American Cancer Society. CA: a cancer journal for
clinicians, 2020. 70(5): p. 321-346.
6. Pathak, P., S. Pajai, and H. Kesharwani, A Review on the Use of the HPV Vaccine in
the Prevention of Cervical Cancer. Cureus, 2022. 14(9).
7. Singh, S.K. and A. Goyal, Performance analysis of machine learning algorithms for
cervical cancer detection, in Research Anthology on Medical Informatics in Breast and
Cervical Cancer. 2023, IGI Global. p. 347-370.
8. National Cancer Institute. Cervical Cancer Screening. April 27, 2023 [cited 2024
January 08]; Available from: https://www.cancer.gov/types/cervical/screening.
9. Dhawan, S., K. Singh, and M. Arora, Cervix image classification for prognosis of
cervical cancer using deep neural network with transfer learning. EAI Endorsed
Transactions on Pervasive Health and Technology, 2021. 7(27).
10. Akbar, H., et al. Optimizing AlexNet using Swarm Intelligence for Cervical Cancer
Classification. in 2021 International Symposium on Electronics and Smart Devices
(ISESD). 2021. IEEE.
11. Lavanya Devi, N. and P. Thirumurugan, Cervical cancer classification from pap smear
images using modified fuzzy C means, PCA, and KNN. IETE Journal of Research, 2022.
68(3): p. 1591-1598.
12. Lu, J., et al., Machine learning for assisting cervical cancer diagnosis: An ensemble
approach. Future Generation Computer Systems, 2020. 106: p. 199-205.
13. Chauhan, N.K. and K. Singh, Performance assessment of machine learning classifiers
using selective feature approaches for cervical cancer detection. Wireless Personal
Communications, 2022. 124(3): p. 2335-2366.
14. Chen, W., et al., Lightweight convolutional neural network with knowledge distillation
for cervical cells classification. Biomedical Signal Processing and Control, 2022. 71:
p. 103177.
15. Manna, A., et al., A fuzzy rank-based ensemble of CNN models for classification of
cervical cytology. Scientific Reports, 2021. 11(1): p. 14538.
16. Pal, A., et al., Deep metric learning for cervical image classification. IEEE Access,
2021. 9: p. 53266-53275.
17. Ghoneim, A., G. Muhammad, and M.S. Hossain, Cervical cancer classification using
convolutional neural networks and extreme learning machines. Future Generation
Computer Systems, 2020. 102: p. 643-649.
18. de Mesquita Junior, J.J., A.R. Backes, and O.M. Bruno. Pap-smear image
classification using randomized neural network based signature. in Progress in
Pattern Recognition, Image Analysis, Computer Vision, and Applications: 22nd
Iberoamerican Congress, CIARP 2017, Valparaíso, Chile, November 7-10, 2017,
Proceedings 22. 2018. Springer.
19. Win, K.P., et al., Computer-assisted screening for cervical cancer using digital image
processing of pap smear images. Applied Sciences, 2020. 10(5): p. 1800.
20. Deo, B.S., et al., CerviFormer: A pap smear‐based cervical cancer classification
method using cross‐attention and latent transformer. International Journal of Imaging
Systems and Technology, 2024. 34(2): p. e23043.
21. Plissiti, M.E., et al. SIPAKMED: A new dataset for feature and image based
classification of normal and pathological cervical cells in Pap smear images. in 2018
25th IEEE International Conference on Image Processing (ICIP). 2018. IEEE.
22. Jantzen, J., et al., Pap-smear benchmark data for pattern classification. Nature inspired
smart information systems (NiSIS 2005), 2005: p. 1-9.
23. Bisong, E., Building machine learning and deep learning models on Google cloud
platform. 2019: Springer.
24. Maurya, R., N.N. Pandey, and M.K. Dutta, VisionCervix: Papanicolaou cervical
smears classification using novel CNN-Vision ensemble approach. Biomedical Signal
Processing and Control, 2023. 79: p. 104156.
Cervical cancer is one of the primary causes of death in women. Like other diseases, it should be diagnosed early and treated according to the best medical advice so that its effects are as minimal as possible. Pap smear images are one of the most constructive ways of identifying this type of cancer. This study proposes a cross-attention-based Transformer approach for the reliable classification of cervical cancer in pap smear images. We propose CerviFormer, a model that relies on Transformers and therefore requires minimal architectural assumptions about the size of the input data. The model uses a cross-attention technique to repeatedly consolidate the input data into a compact latent Transformer module, which enables it to manage very large-scale inputs. We evaluated our model on two publicly available pap smear datasets. For 3-state classification on the SIPaKMeD data, the model achieved an accuracy of 96.67%. For 2-state classification on the Herlev data, the model achieved an accuracy of 94.57%. Experimental results on these two datasets demonstrate that the proposed method achieves competitive results when compared to contemporary approaches. The proposed method provides a comprehensive classification model to detect cervical cancer in pap smear images. This may aid medical professionals in providing better cervical cancer treatment and, consequently, enhance the overall effectiveness of the testing process.
The main risk factor for invasive cervical carcinoma is persistent infection by the high-risk human papillomavirus (HPV). HPV is the most prevalent sexually transmitted infection (STI) and has been linked to 15 different cancers. Cervical cancer is one of the most frequent cancers among women, particularly in resource-limited countries. Cervical cancer is an HPV disease with the highest worldwide burden in resource-limited nations. With improved medical care and nationwide screening programmes, the mortality rate from cervical cancer has decreased in the past 40 years. Many developing nations have been shown to have inadequate knowledge and health-seeking practices, making proper awareness and immunisation programmes necessary. The best strategy to reduce the incidence of cervical cancer is through the administration of HPV vaccines along with routine cervical screening. The HPV vaccine is crucial for public health. Vaccinations against all HPV subtypes, namely, bivalent, quadrivalent, and nonavalent, are available. Financial issues are the main barrier to HPV vaccination. The framework for behavioural and social drivers of vaccination, which includes practical concerns, motivation, social processes, thoughts, and feelings, is widely used to uncover important aspects linked with HPV vaccination. The burden of cervical cancer due to HPV and the advantages of HPV vaccination are summarised in this review article.
Worldwide, cervical cancer is the leading cause of cancer death among women. The symptoms of this gynecological disease are difficult to recognize at an early stage, especially in countries that lack screening programs. In the diagnosis of cervical cancer, machine learning methods can be used to detect malignant cancer cells at an initial stage. The foremost concerns in disease diagnosis are data imbalance and non-uniform scaling in the dataset. In this article, a prevalent oversampling approach, the Synthetic Minority Oversampling Technique, is used along with fivefold cross-validation on unscaled and scaled data to handle these issues. A comparison is made among the performance of the most prevalent machine learning (ML) classifiers, namely Naive Bayes, Logistic Regression, K-Nearest Neighbor, Support Vector Machine (SVM), Linear Discriminant Analysis, Multi-Layer Perceptron, Decision Tree (DT) and Random Forest (RF), on unscaled data and on scaled data obtained by Min-Max scaling, Standard scaling and Normalization. RF, SVM and DT are the top three ML algorithms for cervical cancer diagnosis, and optimization possibilities for them are explored with two feature selection methods: Univariate feature selection and Recursive Feature Elimination (RFE). The overall performance of the Random Forest predictor with RFE (RF-RFE) is superior to all the others implemented.
Cervical cancer affects more than 0.5 million women annually causing more than 0.3 million deaths. Detection of cancer in its early stages is of prime importance for eradicating the disease from the patient’s body. However, regular population-wise screening of cancer is limited by its expensive and labour intensive detection process, where clinicians need to classify individual cells from a stained slide consisting of more than 100,000 cervical cells, for malignancy detection. Thus, Computer-Aided Diagnosis (CAD) systems are used as a viable alternative for easy and fast detection of cancer. In this paper, we develop such a method where we form an ensemble-based classification model using three Convolutional Neural Network (CNN) architectures, namely Inception v3, Xception and DenseNet-169 pre-trained on ImageNet dataset for Pap stained single cell and whole-slide image classification. The proposed ensemble scheme uses a fuzzy rank-based fusion of classifiers by considering two non-linear functions on the decision scores generated by said base learners. Unlike the simple fusion schemes that exist in the literature, the proposed ensemble technique makes the final predictions on the test samples by taking into consideration the confidence in the predictions of the base classifiers. The proposed model has been evaluated on two publicly available benchmark datasets, namely, the SIPaKMeD Pap Smear dataset and the Mendeley Liquid Based Cytology (LBC) dataset, using a 5-fold cross-validation scheme. On the SIPaKMeD Pap Smear dataset, the proposed framework achieves a classification accuracy of 98.55% and sensitivity of 98.52% in its 2-class setting, and 95.43% accuracy and 98.52% sensitivity in its 5-class setting. On the Mendeley LBC dataset, the accuracy achieved is 99.23% and sensitivity of 99.23%. The results obtained outperform many of the state-of-the-art models, thereby justifying the effectiveness of the same. 
The relevant code for the proposed model is publicly available on GitHub.
INTRODUCTION: Cervical cancer is the leading cancer among female cancers. It develops in the cervix. It takes decades to develop and is therefore preventable if diagnosed at an early stage. The cervix is classified into three types: Type I, Type II, and Type III. The efficacy of the treatment depends on correctly diagnosing the type of cervix. The differences between the three types are subtle, so identifying the right type is a difficult task even for health care providers. To address this problem, we propose an algorithm based on the standard transfer learning approach for building a model that classifies cervix images. OBJECTIVES: The objective of this study is to develop a cervical cancer predictive model based on deep learning and transfer learning techniques that will recognize and classify cervix images into one of the classes (Type 1/Type 2/Type 3). METHODS: The techniques used in the experimental work include deep learning and transfer learning. Three pre-trained models, namely InceptionV3, ResNet50, and VGG19, are used to create a ConvNet that classifies the cervix images. RESULTS: The experiment shows that the InceptionV3 model performs better than VGG19 and ResNet50, with an accuracy of 96.1% on the cervical cancer dataset. CONCLUSION: In the future, augmentation techniques can be employed to achieve better accuracy.
Cervical cancer is caused by the persistent infection of certain types of the Human Papillomavirus (HPV) and is a leading cause of female mortality, particularly in low and middle-income countries (LMIC). Visual inspection of the cervix with acetic acid (VIA) is a commonly used technique in cervical screening. While this technique is inexpensive, clinical assessment is highly subjective, and relatively poor reproducibility has been reported. A deep learning-based algorithm for automatic visual evaluation (AVE) of aceto-whitened cervical images was shown to be effective in detecting confirmed precancer (i.e. the direct precursor to invasive cervical cancer). The images were selected from a large longitudinal study conducted by the National Cancer Institute in the Guanacaste province of Costa Rica. The training of AVE used annotation of the cervix boundary, and the data scarcity challenge was dealt with using manually optimized data augmentation. In contrast, we present a novel approach for cervical precancer detection using a deep metric learning-based (DML) framework that does not require any cervix boundary marking. DML is an advanced learning strategy that can better deal with data scarcity and with biased training caused by class-imbalanced data. Three widely used state-of-the-art DML techniques are evaluated: (a) contrastive loss minimization, (b) N-pair embedding loss minimization, and (c) batch-hard loss minimization. Three popular deep convolutional neural networks (ResNet-50, MobileNet, NasNet) are configured for training with DML to produce class-separated (i.e. linearly separable) image feature descriptors. Finally, a K-Nearest Neighbor (KNN) classifier is trained with the extracted deep features. Both the feature quality and the classification performance are quantitatively evaluated on the same data set as used in AVE. The results show that, unlike AVE and without using any data augmentation, the best model produced by our research improves specificity in disease detection without compromising sensitivity. The present research thus paves the way for new research directions in the related field.
About half a million women in the world are affected by cervical cancer, and about 0.3 million deaths per year are due to the disease. Cytologists perform Pap smear tests to screen images of cervical cells. This manual screening is prone to error. Therefore, automated computer-aided detection systems have been proposed for the classification of cervical cancer cell images. In the proposed work, an ensemble of a Vision Transformer (ViT) network and a convolutional neural network (CNN) is proposed for the classification of cervical cell Pap smear images. ViT is known for its minimal inductive bias and its competitive classification performance in comparison to state-of-the-art convolutional neural networks. Fine-tuning a large ViT network is a computationally intensive procedure; therefore, as an alternative to the ViT-CNN approach, another transfer learning-based approach is also proposed, in which the features extracted from pre-trained CNNs are combined and classified with a resource-efficient Long Short-Term Memory (LSTM) network. The two approaches are compared on the basis of their classification performance, test time, generalization ability and attention maps. Experimental results show that the ViT-CNN ensemble approach achieved 97.65% classification accuracy, whereas the LSTM-based approach achieved 95.80% classification accuracy. The ViT-CNN ensemble approach achieves better classification accuracy at the cost of a much higher demand for computation, in particular a large amount of random access memory (RAM) in the graphics processing unit (GPU), whereas the CNN-LSTM approach is less accurate but computationally cheaper.
Cervical cancer has the second-highest mortality rate, after breast cancer, among women in developing countries. Early detection of the abnormality is the only way to prevent morbidity. Because the decision about cell abnormality is made manually, the traditional Pap smear test, the clinical test conducted for the detection of cervical cancer, is prone to false-negative and false-positive cases. This paper presents a novel approach for the automatic detection of cervical cancer using modified fuzzy C-means, extraction of geometrical and texture features, Principal Component Analysis (PCA), and classification. Modified fuzzy C-means shows promising results in segmenting the input image into meaningful regions even when there is uncertainty. PCA is performed to reduce the dimensionality of the data set by retaining only the uncorrelated features, thereby reducing the processing time of the algorithm. The pap smear images are classified into normal and abnormal cells by K Nearest Neighbour (KNN) classification with k-fold cross-validation, and the results are compared with Fine Gaussian SVM, Ensemble Bagged trees, and Linear Discriminant. The efficiency of the proposed method is measured by calculating minimum accuracy, maximum accuracy, average accuracy, sensitivity, specificity, F1-score, and precision. For threefold cross-validation, the proposed method achieves minimum accuracy 94.15%, maximum accuracy 96.28%, average accuracy 94.86%, sensitivity 97.96%, specificity 83.65%, F1-score 96.87%, and precision 96.31%.
Cervical cancer is among the most lethal human malignancies. Women in developing countries are overwhelmingly vulnerable to cervical cancer because of the lack of medical resources. Although artificial intelligence technologies have made great progress in healthcare and medical practice, most studies in cervical cell classification focus on improving accuracy without considering resource limitations. To build a compact and effective model that meets design requirements for embedded devices, a lightweight convolutional neural network (CNN) architecture is chosen to establish a highly efficient model with minimal parameters and calculations. Furthermore, knowledge distillation is used to improve the representation power of the lightweight CNN. This paper also investigates the importance of model selection in the proposed method. Experiments are conducted on the Herlev Pap smear dataset for the fine-grained 7-class classification task. The lightweight Xception, MobileNet and MobileNetV2 all achieved improved results (Xception from 71.39% to 72.25%, MobileNet from 62.12% to 64.14%, MobileNetV2 from 60.92% to 61.20%), and the best-performing Xception model achieves comparable accuracy (72.25% compared to 73.58%) with only 40% of the model size of the powerful Inception-ResnetV2 model. The results show that the proposed method can be used to develop a lightweight CNN model with improved accuracy, which is believed to be the first study of cervical cell classification under limited resources. Furthermore, the compact model can potentially lead to a more economical and effective computer-aided system for the diagnosis and prevention of cervical cancer.