Early Diagnoses of Acute Lymphoblastic Leukemia Using YOLOv8 and YOLOv11 Deep Learning Models

Alaa Awad, Mohamed Hegazy, Salah A. Aly
Faculty of Computing and Data Science, Badya University
October 6th City, Giza, Egypt
Abstract—Thousands of individuals succumb annually to
leukemia alone. This study explores the application of image
processing and deep learning techniques for detecting Acute
Lymphoblastic Leukemia (ALL), a severe form of blood cancer
responsible for numerous annual fatalities. As artificial
intelligence technologies advance, the research investigates the
reliability of these methods in real-world scenarios. The study
focuses on recent developments in ALL detection, particularly
using the latest YOLO series models, to distinguish between
malignant and benign white blood cells and to identify different
stages of ALL, including early stages. Additionally, the models are
capable of detecting hematogones, which are often misclassified as
ALL. By utilizing advanced deep learning models like YOLOv8
and YOLOv11, the study achieves high accuracy rates reaching
98.8%, demonstrating the effectiveness of these algorithms across
multiple datasets and various real-world situations.
Index Terms—Lymphoblastic Leukemia, YOLOv8 and YOLOv11
Deep Learning Models
I. INTRODUCTION
Today, we live in a technology-driven era where computer
science is harnessed to mimic human intelligence, enabling
more accurate and faster decision-making. As a result, many
researchers have tackled the challenge of leukemia detection
through Artificial Intelligence, employing various deep
learning methodologies such as MobileNetV2 [1], attention
mechanisms [2], and YOLO [3]. Several datasets have been
used in different studies, including the ALL image dataset [4]
and the CNMC-2019 dataset [5].
Most research to date has relied on single-cell datasets to
train AI models. However, in real-world applications, models
need to handle multi-cell images and still maintain high
accuracy. This paper aims to bridge this gap by training the
proposed model on multi-cell samples.
In this study, we used image processing techniques such as
segmentation to prepare the dataset. Additionally, we applied
transfer learning and fine-tuning techniques on models like
YOLOv11 [6] and YOLOv8 [7], achieving results above 98%
accuracy.
The contributions of this research are summarized as follows:
1) To the best of our knowledge, this is the first work to utilize
YOLOv11 for ALL blood cancer detection.
2) The integration of two datasets to improve generalization
across different sample types.
3) The classification of white blood cells as malignant or
benign.
4) Successful detection of hematogones, which are often
misclassified as ALL.
5) A comparative analysis of our results with previous work
in the field.
II. RELATED WORK
Hosseini et al. [8] aimed to detect B-cell acute lymphoblastic
leukemia (B-ALL) and its subtypes using a deep CNN.
Talaat et al. [9] applied an attention mechanism to detect and
classify leukemia cells.
Yan [10] worked with the single-cell dataset CNMC-
2019 [5] to classify normal and cancerous white blood cells
using three models: YOLOv4, YOLOv8, and a CNN. Data
augmentation was applied to the CNN and YOLOv4 models.
The CNN model, featuring convolutional layers, max-pooling
layers, and ReLU activation, achieved 93% accuracy, while
YOLOv4 and YOLOv8 each surpassed 95%.
Devi et al. [11] combined custom-designed and pretrained
CNN architectures to detect ALL in the augmented ALL image
dataset [4]. The custom CNN extracted hierarchical features,
while VGG-19 extracted high-level features and performed
classification, achieving 97.85% accuracy. In contrast,
Khosrosereshki and Menhaj [12] used image processing and a
fuzzy rule-based inference system for this task.
Rahmani et al. [13] utilized the C-NMC 2019 dataset,
applying preprocessing techniques such as grayscaling and
masking, followed by feature extraction via transfer learning
using VGG19, ResNet50, ResNet101, ResNet152,
EfficientNetB3, DenseNet-121, and DenseNet-201. Feature selection
employed Random Forest, Genetic Algorithms, and Binary
Ant Colony Optimization. The classification, done through a
multilayer perceptron, achieved slightly above 90% accuracy.
Kumar et al. [14] focused on classifying different blood
cancers, including ALL and Multiple Myeloma, in white blood
cells. After preprocessing and augmentation, feature selection
was done using SelectKBest. Their model comprised two
blocks with convolutional and max-pooling layers, followed
by fully connected and classification layers, achieving 97.2%
accuracy.
Saikia et al. [15] introduced VCaps-Net, a fine-tuned
VGG16 combined with a capsule network for ALL detection.
Using the ALL-IDB1 dataset [16] and a private dataset,
VCaps-Net maintained spatial relationships in images through
capsule vectors, avoiding the loss often caused by MaxPooling,
and achieved 98.64% accuracy.
III. DATASETS AND DATA COLLECTIONS
Available datasets are divided into two types: single-cell
and multi-cell datasets. Single-cell datasets typically contain
images with a single white blood cell per image, whereas
multi-cell datasets depict multiple cells within each sample.
Since multi-cell datasets better represent real-life scenarios
when working with blood cells, we chose to focus on them.
The two datasets selected for this study are the Acute
Lymphoblastic Leukemia (ALL) image dataset from Kaggle [4]
and ALL-IDB1 [16], both of which contain multiple white
blood cells per sample.
Table I summarizes the statistics of the ALL image dataset,
which contains 3,256 images in total, divided into four
categories: Benign, Early, Pre, and Pro. The Benign class includes
hematogones, lymphoid cells that accumulate in a pattern
similar to ALL but are non-cancerous and generally
harmless. The dataset consists of 504 benign images and
2,752 malignant images, the latter further categorized into 985
Early samples, 963 Pre samples, and 804 Pro samples.
TABLE I: Sample distribution per class for the ALL image dataset
Class     Samples per class
Benign    504
Early     985
Pre       963
Pro       804
Total     3256
Table II, on the other hand, summarizes the ALL-IDB1 dataset,
which includes 108 images in total, divided into 59 normal
blood samples and 49 cancerous ones. This balance between
normal and cancerous samples is crucial for the model to
effectively learn the distinguishing features of ALL cells.
TABLE II: Sample distribution per class for the ALL-IDB1 dataset
Class     Samples per class
Normal    59
Cancer    49
Total     108
We decided to merge the normal cells from ALL-IDB1 with the
benign cells from the ALL dataset into a single category called
Normal. Similarly, we combined the Early, Pre, and Pro classes
from the ALL dataset with the Cancer class from ALL-IDB1
into one category called Cancer. As a result, we focused on
two classes: Normal and Cancer. This approach exposes the
models to different datasets and various shapes of blast cells,
allowing for more practical detection and classification.
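The merging scheme described above can be sketched as a simple label mapping. This is a minimal illustration rather than the authors' code: the dictionary and function names are hypothetical, the per-class counts are those of Tables I and II, and the sketch assumes every image from both datasets is kept.

```python
from collections import Counter

# Per-class image counts as reported in Tables I and II.
ALL_IMAGE_DATASET = {"Benign": 504, "Early": 985, "Pre": 963, "Pro": 804}
ALL_IDB1 = {"Normal": 59, "Cancer": 49}

# Hypothetical mapping from (dataset, original class) to the merged label.
MERGED_LABEL = {
    ("ALL", "Benign"): "Normal",
    ("ALL", "Early"): "Cancer",
    ("ALL", "Pre"): "Cancer",
    ("ALL", "Pro"): "Cancer",
    ("ALL-IDB1", "Normal"): "Normal",
    ("ALL-IDB1", "Cancer"): "Cancer",
}


def merged_counts():
    """Return the number of images per merged class."""
    counts = Counter()
    for name, dataset in (("ALL", ALL_IMAGE_DATASET), ("ALL-IDB1", ALL_IDB1)):
        for original_class, n in dataset.items():
            counts[MERGED_LABEL[(name, original_class)]] += n
    return dict(counts)


print(merged_counts())
```

Under this assumption the merged set would hold 563 Normal and 2,801 Cancer images, so the two merged classes are far from balanced.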
IV. MODELS AND METHODOLOGIES
The implementation is divided into several phases, as
illustrated in Fig. 1. The first phase involves data preparation,
where image segmentation techniques are applied to isolate
the relevant elements. Next, the pretrained YOLOv11s and
YOLOv8 models are loaded. These models support data
augmentation and experimentation with various optimizers
and learning rates. In the final phase, the models are trained
to fine-tune the pretrained weights for the specific task at hand.
Fig. 1: The implementation process, including data preparation
and model training and evaluation.
Dataset Preparation: The dataset underwent preprocessing
to enhance model performance by removing irrelevant
elements like different backgrounds and unrelated blood
components. Image segmentation was applied using
OpenCV, converting images to the HSV color space
and creating a binary mask to isolate white blood cells.
To improve robustness and mitigate overfitting, various
augmentation techniques were implemented, including
geometric transformations, mosaic augmentation, random
erasing, and RandAugment for diverse data variations.
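The segmentation step (convert to HSV, threshold, keep only the masked pixels) can be sketched as follows. This is a dependency-free stand-in that uses Python's `colorsys` module instead of OpenCV; in OpenCV the same two steps would typically be `cv2.cvtColor` with `cv2.COLOR_BGR2HSV` followed by `cv2.inRange`. The threshold bounds and function names are hypothetical, since the paper does not report the values it used.

```python
import colorsys

# Hypothetical HSV bounds (hue, saturation, value, each in [0, 1]) for
# purple-stained white blood cells; real bounds must be tuned on the data.
LOWER = (0.65, 0.30, 0.20)
UPPER = (0.90, 1.00, 1.00)


def hsv_mask(image):
    """Return a binary mask (1 = keep) for rows of (R, G, B) tuples in 0..255."""
    mask = []
    for row in image:
        mask_row = []
        for r, g, b in row:
            hsv = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
            inside = all(lo <= c <= hi for c, lo, hi in zip(hsv, LOWER, UPPER))
            mask_row.append(1 if inside else 0)
        mask.append(mask_row)
    return mask


def apply_mask(image, mask):
    """Black out every pixel that the mask rejects."""
    return [
        [px if keep else (0, 0, 0) for px, keep in zip(row, mask_row)]
        for row, mask_row in zip(image, mask)
    ]


# Tiny synthetic image: one purple "cell" pixel and one pale background pixel.
image = [[(120, 40, 160), (240, 200, 210)]]
mask = hsv_mask(image)
segmented = apply_mask(image, mask)
```

Only the purple pixel falls inside the bounds here, so the background pixel is set to black in `segmented`.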
Image Classification: The detection of blast cells can
be approached in various ways, with image classification
being one of the most common. Numerous deep learning
architectures have been developed to support this task, with
Convolutional Neural Networks (CNNs) being the most widely
used for image and video datasets. Models like VGG,
AlexNet, and GoogleNet (Inception) [17] are all based on
CNNs. In this paper, we focus on two versions of YOLO:
YOLOv8 and YOLOv11.
To train and optimize model performance, we employed two
key techniques: transfer learning and fine-tuning.
First, transfer learning was applied using a pretrained
YOLOv8 model, which was loaded through the Ultralytics
package and then trained on our custom segmented dataset.
Various tests were conducted using different optimizers and
hyperparameters, but the final results were based on 100
epochs of training, using the AdamW optimizer with a learning
rate of 0.000714.
Second, YOLOv11s, the latest version in the YOLO series,
was also trained on our custom dataset. We experimented with
the small version of the model to observe the performance
on different versions of YOLO. This gave us the chance to
understand the enhancements in the new version of the model.
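The fine-tuning setup can be sketched with the Ultralytics Python API. Several details here are assumptions rather than facts from the paper: the checkpoint file names (the small classification variants), the dataset path, the helper name `fine_tune`, and the reuse of the reported YOLOv8 settings for YOLOv11s. Only the 100 epochs, the AdamW optimizer, and the learning rate of 0.000714 come from the text.

```python
# Training settings reported in the text: 100 epochs, AdamW, learning rate 0.000714.
TRAIN_ARGS = {"epochs": 100, "optimizer": "AdamW", "lr0": 0.000714}

# Assumed checkpoint names (small classification variants published by
# Ultralytics); the paper does not name the exact weight files.
CHECKPOINTS = {"YOLOv8s": "yolov8s-cls.pt", "YOLOv11s": "yolo11s-cls.pt"}


def fine_tune(model_name, data_dir):
    """Fine-tune one pretrained model on a dataset folder (hypothetical path)."""
    # Imported lazily so the settings above can be inspected without the package.
    from ultralytics import YOLO

    model = YOLO(CHECKPOINTS[model_name])  # load the pretrained weights
    return model.train(data=data_dir, **TRAIN_ARGS)


# Example call, commented out because it needs the ultralytics package,
# downloadable weights, and a dataset on disk:
# fine_tune("YOLOv11s", "datasets/all_merged")
```

Keeping the settings in one dictionary makes it easy to train both models under identical conditions, which is what a like-for-like comparison would need.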
Performance Metrics: Accuracy is an overall indicator
of how well the model performs: the number of correctly
identified samples out of all the given samples. It is the sum
of true positives (TP) and true negatives (TN) divided by the
total number of examples, which also includes false positives
(FP) and false negatives (FN), as expressed in Equation 1:
\[ \text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{1} \]
Precision, presented in Equation 2, is the ratio of correctly
identified positive instances to the total number of instances
classified as positive.
\[ \text{Precision} = \frac{TP}{TP + FP} \tag{2} \]
Recall, or Sensitivity, is calculated as the ratio of correctly
identified positive instances to the total number of actual
positive instances, as described in Equation 3.
\[ \text{Recall} = \frac{TP}{TP + FN} \tag{3} \]
Another significant metric that contributed to our results is
the F1 score. It is obtained by calculating the harmonic mean of
precision and recall, as illustrated in Equation 4.
\[ \text{F1 Score} = 2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \tag{4} \]
In addition to the previous indicators, we calculated the
specificity using the formula in Equation 5. It refers to the
proportion of correctly identified negative instances among
all actual negative cases, and it reflects the model’s ability to
correctly recognize samples that do not belong to the disease class.
\[ \text{Specificity} = \frac{TN}{TN + FP} \tag{5} \]
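Equations 1 to 5 can be checked numerically with a few lines of Python. The function name and the counts below are hypothetical, chosen only to illustrate the arithmetic; they are not results from this study. The sketch assumes every denominator is non-zero.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute Equations 1-5 from binary confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)           # Equation 1
    precision = tp / (tp + fp)                           # Equation 2
    recall = tp / (tp + fn)                              # Equation 3
    f1 = 2 * precision * recall / (precision + recall)   # Equation 4
    specificity = tn / (tn + fp)                         # Equation 5
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "specificity": specificity,
    }


# Hypothetical counts, not taken from the paper.
metrics = classification_metrics(tp=90, tn=80, fp=20, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

With these counts the accuracy is 170/200 = 0.85 while the specificity is only 80/100 = 0.80, which shows why a single accuracy figure can hide a weaker result on the negative class.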
V. EXPERIMENTAL RESULTS
In this section, the performance of our trained models is
evaluated using the metrics described in the previous section.
We start with YOLOv11s, which was trained for 100 epochs and
achieved 97.4% training accuracy and 98.8% testing accuracy.
The accuracy graph for the nano version of YOLOv11
shown in Fig. 2b demonstrates steady improvement in accuracy,
with some fluctuations at the beginning. These variations
decrease gradually as the number of epochs increases
until the curve becomes more stable. The training and
validation losses declined steadily as training advanced,
as shown in Figs. 2a and 6a.
Fig. 2: Performance metrics for YOLOv11s. (a) Train loss, (b)
Accuracy, and (c) Validation loss.
The confusion matrix in Fig. 3 offers valuable insights
into the YOLOv11s model’s performance, highlighting which
classes are accurately detected and where errors occur. This
analysis helps identify areas for improvement to enhance the
model’s effectiveness. The matrix indicates that the model
achieved high accuracy in detecting cancer across all stages,
though it did misclassify 0.06% of healthy white blood cells as
cancerous. Overall, the analysis validates the model’s strong
performance while pinpointing specific issues that require
optimization.
Fig. 3: Normalized confusion matrix for YOLOv11s.
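A normalized confusion matrix of this kind is obtained by dividing each count by the total for its true class. The sketch below is illustrative only: the counts are hypothetical, the function name is made up, and the convention of rows as true classes and columns as predicted classes is an assumption that may differ from the layout of the plotted figure.

```python
def normalize_rows(matrix):
    """Divide each row by its sum so that every non-empty row adds up to 1."""
    normalized = []
    for row in matrix:
        total = sum(row)
        normalized.append([value / total if total else 0.0 for value in row])
    return normalized


# Hypothetical counts; rows = true class, columns = predicted class,
# both in the order [Normal, Cancer].
counts = [[45, 5], [2, 98]]
normalized = normalize_rows(counts)
for label, row in zip(["Normal", "Cancer"], normalized):
    print(label, [round(value, 2) for value in row])
```

In this made-up example the off-diagonal entry of the first row is the fraction of Normal samples predicted as Cancer, which is the kind of error the paragraph above discusses.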
Next, we analyze the results from YOLOv8 shown in Figs.
4a, 4b, and 6b. The accuracy on the training dataset
was 98%, and it peaked at 98.4% when evaluated on the testing
dataset. Compared to YOLOv11, YOLOv8’s small version
achieves a slightly lower accuracy. The accuracy, training loss,
and validation loss graphs of YOLOv8 follow similar
patterns to those of YOLOv11, with the latter having more
stable curves.
Fig. 4: (a) Train loss, (b) Accuracy.
Fig. 5: Confusion matrix for YOLOv8s.
Fig. 6: The validation losses for both (a) YOLOv11 and (b)
YOLOv8.
VI. CONCLUSION AND COMPARISON
In conclusion, the integration of AI in the medical field
is a major step in the advancement of the health system
and services provided to patients. This study was able to
detect the presence of ALL in blood, even at early stages,
using YOLOv11 and YOLOv8. YOLOv11 performed slightly better
than YOLOv8 (98.8% versus 98.4% testing accuracy), a difference
that can be significant in cancer diagnosis.
Table III demonstrates a comparison between our findings
and some of the previous studies.
TABLE III: Comparison of Different Approaches for Detecting
Acute Lymphoblastic Leukemia (ALL)
Study               Methodology               Accuracy                               Dataset
Yan [10]            YOLOv4, YOLOv8, and CNN   CNN: 93%, YOLOv4: >95%, YOLOv8: >95%   CNMC-2019 [5]
Devi et al. [11]    Custom + pretrained CNN   97.85%                                 ALL dataset [4]
Saikia et al. [15]  VCaps-Net                 98.64%                                 ALL-IDB1 [16]
Kumar et al. [14]   Custom CNN                97.2%                                  Custom dataset
Our study           YOLOv11s and YOLOv8s      YOLOv11s: 98.8%, YOLOv8s: 98.4%        ALL-IDB1 [16] + ALL dataset [4]
REFERENCES
[1] M. Sandler, A. Howard, M. Zhu, A. Zhmoginov, and L.-C. Chen,
“Mobilenetv2: Inverted residuals and linear bottlenecks,” 2019. [Online].
Available: https://arxiv.org/abs/1801.04381
[2] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N.
Gomez, L. Kaiser, and I. Polosukhin, “Attention is all you need,” 2023.
[Online]. Available: https://arxiv.org/abs/1706.03762
[3] J. Redmon, S. Divvala, R. Girshick, and A. Farhadi, “You only look
once: Unified, real-time object detection,” in 2016 IEEE Conference on
Computer Vision and Pattern Recognition (CVPR), 2016.
[4] M. Ghaderzadeh, M. Aria, A. Hosseini, F. Asadi, D. Bashash, and
H. Abolghasemi, “A fast and efficient cnn model for b-all diagnosis
and its subtypes classification using peripheral blood smear images,”
Int. J. Intell. Syst., 2021.
[5] S. Mourya, S. Kant, P. Kumar, A. Gupta, and R. Gupta, “All challenge
dataset of isbi 2019 (c-nmc 2019) (version 1),” 2019. [Online].
Available: https://doi.org/10.7937/tcia.2019.dc64i46r
[6] Ultralytics, “Yolov11 - key features,” 2024. [Online]. Available:
https://docs.ultralytics.com/models/yolo11/#key-features
[7] R. Varghese and S. M., “Yolov8: A novel object detection algorithm
with enhanced performance and robustness,” in 2024 International
Conference on Advances in Data Engineering and Intelligent Computing
Systems (ADICS), 2024.
[8] A. Hosseini et al., “A mobile application based on efficient lightweight
cnn model for classification of b-all cancer from non-cancerous cells: A
design and implementation study,” Informatics in Medicine Unlocked,
vol. 39, 2023.
[9] F. M. Talaat and S. A. Gamel, “A2m-leuk: attention-augmented
algorithm for blood cancer detection in children,” Neural Computing and
Applications, vol. 35, no. 24, 2023.
[10] E. Yan, “Detection of acute myeloid leukemia using deep learning
models based systems,” in IFMBE Proceedings, 2024, pp. 421–431.
[11] J. R. Devi, P. S. Kadiyala, S. Lavu, N. Kasturi, and L. Kosuri,
“Enhancing acute lymphoblastic leukemia classification with a rapid and
effective cnn model,” 2024.
[12] M. A. Khosrosereshki and M. B. Menhaj, “A fuzzy based classifier
for diagnosis of acute lymphoblastic leukemia using blood smear image
processing,” in 2017 5th Iranian Joint Congress on Fuzzy and Intelligent
Systems (CFIS), 2017.
[13] A. M. Rahmani et al., “A diagnostic model for acute lymphoblastic
leukemia using metaheuristics and deep learning methods,” 2024.
[Online]. Available: https://arxiv.org/abs/2406.18568
[14] D. Kumar, N. Jain, A. Khurana, S. Mittal, S. C. Satapathy, R. Senkerik,
and J. D. Hemanth, “Automatic detection of white blood cancer from
bone marrow microscopic images using convolutional neural networks,”
IEEE Access, vol. 8, 2020.
[15] R. Saikia, A. Sarma, K. M. Singh, and S. S. Devi, “Vcaps-net:
Fine-tuned vgg16 with capsule network for acute lymphoblastic leukemia
detection on a diverse dataset,” in 2024 6th International Conference on
Energy, Power and Environment (ICEPE), 2024.
[16] A. Genovese, V. Piuri, K. N. Plataniotis, and F. Scotti, “Dl4all:
Multi-task cross-dataset transfer learning for acute lymphoblastic leukemia
detection,” IEEE Access, vol. 11, pp. 65222–65237, 2023.
[17] C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov,
D. Erhan, V. Vanhoucke, and A. Rabinovich, “Going deeper with
convolutions,” 2014. [Online]. Available: https://arxiv.org/abs/1409.4842