Received: 23 May 2024 | Revised: 12 December 2024 | Accepted: 2 January 2025

IET Computer Vision
DOI: 10.1049/cvi2.12341

ORIGINAL RESEARCH
NBCDC-YOLOv8: A new framework to improve blood cell detection and classification based on YOLOv8
Xuan Chen¹ | Linxuan Li¹ | Xiaoyu Liu¹ | Fengjuan Yin¹ | Xue Liu² | Xiaoxiao Zhu³ | Yufeng Wang¹ | Fanbin Meng¹

¹School of Medical Information Engineering, Jining Medical University, Rizhao, China
²College of Basic Medicine, Jining Medical University, Jining, China
³Respiratory Medicine Department, Rizhao Traditional Chinese Medicine Hospital, Rizhao, China
Correspondence
Yufeng Wang and Fanbin Meng, School of Medical Information Engineering, Jining Medical University, Rizhao, China.
Email: wyf@mail.jnmc.edu.cn and drmeng@mail.jnmc.edu.cn
Funding information
Innovation and Entrepreneurship Training Program
for College Students, Grant/Award Numbers:
202210443002, 202210443003, S202310443006,
cx2022044z, cx2023094z; Jining Medical University
Classroom Teaching Reform Research Project,
Grant/Award Number: 2022KT012
Abstract
In recent years, computer technology has successfully permeated all areas of medicine and its management, and it now offers doctors an accurate and rapid means of diagnosis. Existing blood cell detection methods suffer from low accuracy, which is caused by the uneven distribution, high density, and mutual occlusion of different blood cell types in blood microscope images. To address these problems, this article introduces NBCDC-YOLOv8: a new framework to improve blood cell detection and classification based on YOLOv8. Our framework innovates on several fronts: it uses Mosaic data augmentation to enrich the dataset and add small targets, incorporates a space-to-depth convolution (SPD-Conv) tailored for cells that are small and have low resolution, and introduces the Multi-Separated and Enhancement Attention Module (MultiSEAM) to enhance feature map resolution. Additionally, it integrates a bidirectional feature pyramid network (BiFPN) for effective multi-scale feature fusion and includes four detection heads to improve recognition accuracy of various cell sizes, especially small target platelets. Evaluated on the Blood Cell Classification Dataset (BCCD), NBCDC-YOLOv8 obtains a mean average precision (mAP) of 94.7%, and thus surpasses the original YOLOv8n by 2.3%.
KEYWORDS
bidirectional feature pyramid network, cell detection, mosaic data augmentation, multi-separated and enhancement attention module, space-to-depth convolution
1 | INTRODUCTION
Blood cell detection and classification are pivotal in clinical diagnosis and crucial for the subsequent identification and treatment of a wide array of diseases. Cell counting [1] is an essential process for evaluating the number of various cell types in a patient's blood, and blood cell detection and classification are primarily used for this purpose. The complete blood count (CBC) is a standard blood test that assesses an individual's overall health and aids in diagnosing a wide range of conditions. This test specifically gauges the primary cell types found in the blood: red blood cells, white blood cells, and platelets [2]. The red blood cell count (RBC) reflects the capacity for oxygen transport, the white blood cell count (WBC) reflects the condition of the immune system, and the platelet count relates to coagulation abilities. Additionally, the CBC encompasses haemoglobin concentration, red blood cell volume distribution width (RDW), mean red blood cell volume (MCV), and other essential markers [3]. These components assist medical professionals in comprehending patients' blood profiles and potential health concerns [4, 5]. Traditional blood cell counting relies on manual microscopic examination [6], which poses challenges such as operational complexity and increased susceptibility to statistical and observational errors when dealing with large volumes of samples [7].
In response to these challenges, researchers have been
exploring the utilisation of deep learning (DL) technology in
medical image analysis and seeking new automated solutions
[8, 9]. In 2021, Siraj Khan et al. comprehensively examined the
use of traditional machine learning (TML) and DL in distinguishing white blood cells from blood smear images [10].

This is an open access article under the terms of the Creative Commons Attribution-NonCommercial-NoDerivs License, which permits use and distribution in any medium, provided the original work is properly cited, the use is non-commercial and no modifications or adaptations are made.
© 2025 The Author(s). IET Computer Vision published by John Wiley & Sons Ltd on behalf of The Institution of Engineering and Technology.
IET Comput. Vis. 2025;e12341. https://doi.org/10.1049/cvi2.12341
They underscored the efficacy and precision of these technologies in clinical diagnosis and especially in detecting haematological diseases such as leukaemia. Similarly, in 2022, Pradeep Kumar Das et al. assessed the contributions of DL and ML in identifying acute lymphoblastic leukaemia (ALL) [11]. They illustrated the promise of convolutional neural networks (CNNs) in analysing intricate blood cell images and suggested methods to further enhance detection accuracy. Current methodologies for detecting and classifying blood cells can be divided into two principal approaches: TML and DL approaches. Traditional machine learning methods rely on handcrafted feature extraction techniques and algorithms such as SVM and kNN. In contrast, DL approaches utilise models such as CNNs that automatically learn features from images, which results in significantly enhanced detection accuracy. Rizki Firdaus Mulya et al. used the InceptionV3 model to recognise white blood cell images [12] and proposed a new method for leukaemia classification detection, which significantly improved early diagnosis accuracy. Subsequently, in 2024, Irfan Sadiq Rahat et al. evaluated the performance of several DL models on blood smear images [13] in terms of precision of automated detection and classification. Similarly, Dongdong Zhang et al. proposed a cell counting algorithm utilising YOLOv3 and image density estimation [14], thereby further enhancing the precision in detection and counting of red and white blood cells.
In blood smear images, the distribution of white blood cells is sparser than that of red blood cells, which makes them easier to count. Conversely, red blood cells pose challenges for detection and counting due to their dense arrangement and tendency to overlap and adhere to each other [15, 16]. Additionally, platelets are small targets in a complex background and hence are often more difficult to count. Wangxinjun Cheng et al. pointed out that traditional image processing methods face significant limitations when dealing with the complex environment of blood cells [17], whereas DL methods have significantly improved detection accuracy thanks to their automatic feature learning; this improvement is especially observed in the pathological detection of diseases such as leukaemia and malaria. Therefore, we improve YOLOv8 to develop a new blood cell detection and classification framework (NBCDC-YOLOv8). Specifically, we aim to enhance its performance in complex data scenarios to ensure model robustness and detection accuracy.
The primary contributions of this article are listed below.
- Development of NBCDC-YOLOv8: We develop a customised YOLOv8 framework tailored for blood cell detection that addresses key challenges, such as cell overlap, uneven distribution, and high density.
- Model Enhancements: The introduction of SPD-Conv, MultiSEAM, and a bidirectional feature pyramid network (BiFPN) improves the detection of small and overlapping blood cells, thereby enhancing overall accuracy.
- State-of-the-Art Performance: The proposed model achieves 94.7% mean average precision (mAP) on the BCCD dataset and thus surpasses existing methods in detection precision.
- Real-time Application: With fast inference speed, the proposed model is well suited for real-time blood cell analysis in medical diagnostics.
2 | PROPOSED METHOD
Currently, the mainstream algorithms for target detection fall into two categories. First, we have two-stage algorithms, which are represented by R-CNN [18] and Faster R-CNN [19]. The basic concept behind such approaches is to generate a set of sparse candidate boxes through either a heuristic method (e.g. a selective search) or a CNN network (e.g. an RPN). Subsequently, these candidate boxes are classified and refined. Second, we have one-stage methods, which are represented by You Only Look Once (YOLO) [20–22] and SSD [23]. The fundamental idea here is to uniformly and densely sample from different areas in an image. Various proportions and aspect ratios can be employed during the sampling process, and then a CNN can be utilised to extract features and directly perform classification and refinement. This one-step process results in faster detection and better accuracy than two-stage detection.

Given the need for fast and efficient blood cell detection and classification in clinical settings, the present study opts for a one-stage algorithm. Specifically, we choose YOLOv8 due to its ability to balance detection speed and accuracy, particularly in complex scenarios involving overlapping and dense cell structures.
2.1 | The network structure of YOLOv8
The YOLO series algorithms have garnered considerable attention due to their exceptional efficiency and accuracy. In 2023, Ultralytics unveiled their most recent iteration of YOLO: YOLOv8.

YOLOv8 enhances the connection between the backbone feature extraction network and the detection head to strengthen feature representation, especially when handling complex scenarios and small object detection. Additionally, YOLOv8 integrates multi-scale feature fusion technology by using an improved feature pyramid network (FPN). This allows the model to better combine features from different layers, thereby improving localisation accuracy and classification ability. In contrast, YOLOv5 relies on CSPDarknet53 as the backbone and uses PANet for feature fusion. While YOLOv5 performs well in terms of speed and accuracy, its multi-scale feature fusion is less efficient than YOLOv8's new architecture.

In summary, YOLOv8 has undergone significant optimisations in its network architecture that benefit feature extraction and multi-scale feature fusion. These improvements are especially crucial for small object detection, such as platelet detection tasks, where YOLOv8 can identify and differentiate small and complex platelets more effectively than its predecessors, demonstrating significantly improved detection accuracy. The YOLOv8 model maintains high detection accuracy and speed, which makes it suitable for real-time applications such as medical observation. Additionally, it exhibits excellent robustness and scalability, which is why we adopt it as the foundational model for further improvements.
Specically, as shown in Figure 1, the input terminal pre-
processes the input images through data enhancement, adap-
tive image scaling, and grayscale lling. Within the backbone
network, image features are extracted using convolution and
pooling techniques implemented through convolutional layer
(Conv), cross stage partial bottleneck with two convolutions
(C2f), and spatial pyramid pooling fusion (SPPF) structures.
The neck terminal is constructed based on a path aggregation
network (PAN), which merges feature maps with varying
scaling scales through upsampling, downsampling, and
feature stitching. The output utilises a decoupled head struc-
ture to separate the classication and regression processes.
This includes positive and negative sample matching, as well as
loss calculation. The YOLOv8 network employs the Task
aligned Assigner method [24] to assign weights to the classi-
cation score and regression score for positive sample
matching. The loss calculation encompasses both classication
loss and regression loss. To compute the classication loss, the
algorithm then uses binary crossentropy while regression loss
is calculated using distribution focal loss [25] (DFL) and
complete intersection over union loss functions.
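To make these loss terms concrete, the following is a minimal NumPy sketch, not the authors' implementation: binary cross-entropy for the classification loss, and plain IoU, which is the quantity at the heart of the CIoU regression loss (the DFL and the extra CIoU penalty terms are omitted for brevity).

```python
import numpy as np

def bce(pred_prob, target, eps=1e-12):
    """Binary cross-entropy, the classification loss used by YOLOv8."""
    p = np.clip(pred_prob, eps, 1 - eps)  # avoid log(0)
    return -(target * np.log(p) + (1 - target) * np.log(1 - p)).mean()

def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes, the core
    quantity of the complete IoU (CIoU) regression loss."""
    x1 = max(box_a[0], box_b[0]); y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2]); y2 = min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)
```

A confident correct prediction yields a lower BCE than an uncertain one, and a perfectly overlapping box yields an IoU of 1; CIoU additionally penalises centre distance and aspect-ratio mismatch.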
2.2 | Improved NBCDC-YOLOv8
We select YOLOv8n as the baseline for our research due to its low parameter count and fast detection speed. Figure 2 shows NBCDC-YOLOv8's architecture, featuring the new SPD-Conv block in its backbone that replaces traditional convolution and pooling layers. Additionally, it incorporates BiFPN at the neck, thereby enhancing the traditional FPN by enabling bidirectional information flow. NBCDC-YOLOv8 also uses MultiSEAM to boost recognition accuracy for small target platelets.
2.2.1 | Mosaic data augmentation
In the detection and classification of blood cells, there are problems such as limited datasets and imbalanced categories, which lead to poor generalisability, low accuracy, and the model being prone to overfitting. We can enhance the diversity of samples and improve background complexity to enhance the generalisability of DL models through various data augmentation techniques [26]. In the case of the YOLOv8 network, when dealing with datasets with insufficient samples, image enhancement processing is applied during the model training stage. The primary data augmentation method utilised by YOLOv8 is Mosaic data augmentation [27]. The main idea is to concatenate four blood cell images into one image as a training sample. Consequently, YOLOv8 can seamlessly process a new image consisting of four blood cell images as a batch input during training. These operations help alleviate the overfitting issues caused by the limited number of samples, decrease GPU consumption, and enhance the robustness and accuracy of the network.

Mosaic was initially introduced in YOLOv4 and is an extension of the CutMix data augmentation algorithm [28]. Since the main focus of our blood cell detection and recognition is colour, the original Mosaic data enhancement method is deemed inappropriate. Thus, we remove the gamut conversion module from Mosaic. Moreover, to enlarge the sample pool and enhance the model's generalisability and robustness, we assemble four images after flipping and zooming. Figure 3 illustrates the comparison between the original Mosaic data enhancement approach and the enhanced one. Mosaic data augmentation effectively addresses the challenges stemming from limited dataset samples and uneven distribution of samples.
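The four-image assembly described above can be sketched as follows. This is an illustrative NumPy version, not the authors' code: the `mosaic` and `resize_nearest` helpers are hypothetical names, random horizontal flips plus a quadrant-filling zoom stand in for the paper's flipping and zooming steps, and bounding-box label handling is omitted.

```python
import numpy as np

def resize_nearest(img, h, w):
    """Nearest-neighbour resize; crude, but dependency-free for a sketch."""
    ys = (np.arange(h) * img.shape[0] / h).astype(int)
    xs = (np.arange(w) * img.shape[1] / w).astype(int)
    return img[ys][:, xs]

def mosaic(images, out_size=640, rng=None):
    """Assemble four images into one mosaic training sample.

    Each image is optionally flipped, zoomed to fill one quadrant of the
    output canvas, and pasted in place, yielding a single image that
    contains content (and small targets) from four source images.
    """
    assert len(images) == 4, "Mosaic combines exactly four images"
    rng = rng or np.random.default_rng()
    half = out_size // 2
    canvas = np.zeros((out_size, out_size, 3), dtype=np.uint8)
    corners = [(0, 0), (0, half), (half, 0), (half, half)]
    for img, (y, x) in zip(images, corners):
        if rng.random() < 0.5:       # random horizontal flip
            img = img[:, ::-1]
        canvas[y:y + half, x:x + half] = resize_nearest(img, half, half)
    return canvas
```

In a real pipeline the four images' annotations would be flipped, scaled, and translated with the same transforms so that the labels remain aligned with the mosaic.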
2.2.2 | SPD-Conv
In tasks demanding exceptional precision and sensitivity to structural nuances, the performance of standard CNNs may fall short. In contrast, SPD-Conv [29] excels in extracting pertinent features more efficiently. In the context of blood cell detection, SPD-Conv emerges as a viable alternative to traditional convolution and pooling layers while maintaining a similar parameter count. Its primary objective is to enhance the detection accuracy of low-resolution blood cell images and minuscule platelet objects. By leveraging its proficient matrix processing capabilities, SPD-Conv holds promise in facilitating disease diagnosis and progress monitoring. This innovative approach comprises two integral components: a space-to-depth layer and a non-strided convolution layer. The implementation of SPD-Conv within the YOLOv8 model, coupled with its validation on the BCCD dataset, underscores its efficacy in blood cell detection.
For a feature map X of any size (W, H, C) (where W = H), its sub-feature map sequence of slices is shown in formula (1).

f(0,0) = X[0:W:scale, 0:W:scale]
f(1,0) = X[1:W:scale, 0:W:scale]
f(0,1) = X[0:W:scale, 1:W:scale]
f(1,1) = X[1:W:scale, 1:W:scale]    (1)
For a given feature map X of any size, sub-map f(x,y) consists of all entries X(i, j) for which i + x and j + y are divisible by scale. The operation of SPD-Conv is shown in Figure 4. First, the feature map input as W × H × C is sampled, and four feature maps with the size of W/2 × H/2 × C are obtained. Then, a feature map with a size of W/2 × H/2 × 4C is obtained by splicing. Next, the dimension is reduced by a 1 × 1 convolution, and the correlation between the feature maps is constructed to obtain a W/2 × H/2 × C feature map. It can be seen that SPD-Conv has higher information retention than traditional convolution, and therefore it can effectively improve feature extraction for small targets.
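The slicing in formula (1) and the pipeline of Figure 4 can be sketched as follows. This is a NumPy illustration with scale = 2; the learned 1 × 1 non-strided convolution is approximated by a fixed random projection, so the sketch shows the data movement, not SPD-Conv's trained behaviour.

```python
import numpy as np

def space_to_depth(x, scale=2):
    """Rearrange a (W, H, C) feature map into (W/scale, H/scale, scale**2 * C).

    Each sub-map f(x, y) keeps the pixels at row offset x and column
    offset y modulo `scale`, as in formula (1); for scale = 2 the four
    slices are concatenated along the channel axis. No pixel is
    discarded, which is why information retention is higher than with a
    strided convolution or pooling layer.
    """
    W, H, C = x.shape
    assert W % scale == 0 and H % scale == 0
    slices = [x[i::scale, j::scale] for j in range(scale) for i in range(scale)]
    return np.concatenate(slices, axis=-1)  # (W/2, H/2, 4C) for scale = 2

def pointwise_conv(x, out_channels, seed=0):
    """1x1 (non-strided) convolution reducing the channel count; a random
    projection stands in for the learned weights in this sketch."""
    w = np.random.default_rng(seed).standard_normal((x.shape[-1], out_channels))
    return x @ w
```

Chaining the two, a W × H × C map becomes W/2 × H/2 × 4C and is then projected back to W/2 × H/2 × C, matching the shapes in Figure 4.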
FIGURE 1 YOLOv8 network structure illustrating four main modules: an input terminal, backbone network, neck terminal, and output terminal.
FIGURE 2 NBCDC-YOLOv8 network structure. Based on the YOLOv8 framework, the improved NBCDC-YOLOv8 algorithm framework integrates a bidirectional feature pyramid network (BiFPN) structure, uses SPD in the backbone, introduces MultiSEAM in the neck, and finally adds four detection heads to the output terminal.
2.2.3 | Bidirectional feature pyramid network fused into YOLOv8
The neck of YOLOv8 contains multi-scale feature fusion technology. Initially, the neck downsamples the input and then upsamples it. We replaced the C3 module with a C2f module, which combines the feature maps from various stages of the backbone to enhance representation. Specifically, YOLOv8's neck consists of an SPPF module, a PAA module, and two PAN modules. However, in the aforementioned multi-scale feature fusion, the feature information at each scale is inconsistent. Improved FPNs such as PANet and NAS-FPN introduce a significant amount of computation or fail to effectively fuse the features, thereby making it difficult to fully utilise the features across different scales. To overcome this limitation, our algorithm design is inspired by the efficient bidirectional cross-scale connection and weighted feature fusion structure of BiFPN [30].

YOLOv8n has a PAN path aggregation structure, as shown in Figure 5b. Although PAN utilises both bottom-up and top-down feature transfer, it is only capable of merging features from two levels. In contrast, BiFPN employs a U-Net-like up-and-down sampling structure, which permits the rapid integration of features across various levels. Figure 5a shows the network structure of BiFPN. This structure not only enhances feature fusion through multi-level integration but also utilises the up-and-down sampling structure to reduce parameter calculation, and this results in efficient image detection across different scales.

It is evident that PAN has a two-way network structure that enables information transmission vertically between the top and bottom layers and horizontally within layers. However, this information transmission mode remains relatively simplistic and can only capture the characteristics of immediate neighbouring nodes. In contrast, the BiFPN network structure employed in this study can interact comprehensively with information from both horizontal and vertical information flow paths. Through the introduction of cross-layer jump connections, it further facilitates the exchange of information across multiple paths. This integration leads to more comprehensive feature map fusion, which then results in richer and more accurate feature expressions. By combining bidirectional scale connections and weighted features, this structure achieves a better balance between accuracy and efficiency.
Taking the P6 feature fusion depicted in Figure 5 as an illustration, we mathematically represent the structure of its weighted bidirectional pyramid network in formula (2).

P6_td = Conv((w1 · P6_in + w2 · Resize(P7_in)) / (w1 + w2 + ε))
P6_out = Conv((w1′ · P6_in + w2′ · P6_td + w3′ · Resize(P5_out)) / (w1′ + w2′ + w3′ + ε))    (2)
We calculate the intermediate feature map P6_td as follows. P6_in and P7_in are the input feature maps. The Resize function is used to change the spatial resolution of a feature map, and w1 and w2 are the weight coefficients used to weight the input feature maps. ε is a small number that prevents the denominator from being zero and is often called a smoothing or stabilising term. These weighted and resized feature maps are summed, and then
FIGURE 3 Comparison of data augmentation with the original Mosaic and our improved version: (a) the original Mosaic data expanded and adjusted in hue (H), saturation (S), and brightness (V) using the HSV model; (b) the improved Mosaic data augmentation with a modified colour gamut conversion module.
FIGURE 4 Operation of SPD-Conv. In SPD-Conv, the input feature map is first transformed through the SPD layer, followed by convolution through the Conv layer.
the sum is divided by the total of the weights plus the smoothing term. Conv represents the convolution operation, which is applied to the normalised weighted feature map and results in P6_td.
Next, we calculate the output feature map P6_out by extending formula (2) with new weight coefficients w1′, w2′, and w3′. This time, in addition to P6_in and P6_td (from the first formula), the weighted sum also includes Resize(P5_out), the output feature map of the neighbouring scale adjusted by the Resize function. As in the first part of formula (2), the weighted feature maps are summed and then divided by the sum of all the weights plus the smoothing term. Then, we perform the convolution operation again to get P6_out.
The BiFPN structure facilitates continuation of this process to achieve seamless transmission and fusion of information across various levels of the feature pyramid. This enables the generation of outputs that contain an extensive range of semantic information.
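Formula (2) corresponds to BiFPN's fast normalised fusion. The sketch below illustrates the P6 node under simplifying assumptions: the convolution is replaced by the identity, Resize is a nearest-neighbour resampling, and the weights are plain non-negative numbers rather than learned parameters; the function names are illustrative, not from the paper's code.

```python
import numpy as np

def fuse(features, weights, eps=1e-4):
    """Fast normalised fusion: a weighted average of same-shape feature
    maps, with eps keeping the denominator away from zero (the
    smoothing term in formula (2))."""
    w = np.maximum(np.asarray(weights, dtype=float), 0.0)  # weights kept non-negative
    num = sum(wi * f for wi, f in zip(w, features))
    return num / (w.sum() + eps)

def resize_to(f, hw):
    """Nearest-neighbour resampling standing in for BiFPN's Resize op."""
    ys = (np.arange(hw[0]) * f.shape[0] / hw[0]).astype(int)
    xs = (np.arange(hw[1]) * f.shape[1] / hw[1]).astype(int)
    return f[ys][:, xs]

def p6_node(p6_in, p7_in, p5_out, w_td=(1.0, 1.0), w_out=(1.0, 1.0, 1.0)):
    """The P6 node of formula (2), with Conv replaced by the identity."""
    p6_td = fuse([p6_in, resize_to(p7_in, p6_in.shape[:2])], w_td)
    p6_out = fuse([p6_in, p6_td, resize_to(p5_out, p6_in.shape[:2])], w_out)
    return p6_out
```

In the real network the weights are trainable scalars (one per incoming edge), so the model learns how much each resolution contributes at every fusion node.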
2.2.4 | Occlusion-aware attention network: MultiSEAM
The introduction of the MultiSEAM module marks a significant advancement in the area of blood cell detection because it addresses the complexities introduced by occlusions among blood cell types. Occlusions can severely impact detection accuracy by affecting the non-maximum suppression threshold, which in turn may lead to the non-detection of cells; this is especially common when cells of the same type occlude each other and thus obscure key features and compromise cell localisation.

MultiSEAM is a cutting-edge neural network module tailored for refining feature recognition and enhancement in intricate image analysis. It is an evolution of the conventional Separation and Enhancement Attention Module (SEAM) [31] and is particularly adept at handling intricate image datasets. The groundbreaking aspect of MultiSEAM lies in its capacity to simultaneously handle and integrate image features spanning diverse dimensions, ranging from nuanced local details to sweeping macro structures.
In the context of blood cell analysis, MultiSEAM excels at precisely segmenting a wide array of cell types through the seamless fusion of multi-scale features. This is achieved using adaptive average pooling and a fully connected layer. Figure 6 presents the architecture of MultiSEAM. Here, we can see that the channel and spatial mixing module employs patch embedding to divide a large word vector matrix into smaller matrices for processing. A GELU activation function and depthwise convolution are used for the convolution operation. This is followed by a point-to-point convolution to enhance information transmission between channels and improve network connectivity. To improve model effectiveness in dealing with blood cell occlusion, the output of the MultiSEAM module is multiplied by the original features as the attention weight. This step aims to compensate for potential information loss in occluded scenes by learning the relationship between occluded and non-occluded blood cells.
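The final reweighting step can be sketched as follows. This is a generic occlusion-attention sketch rather than the authors' MultiSEAM code: global average pooling stands in for the module's pooling and fully connected layers, and an exponential is assumed to map the module's output to positive per-channel weights.

```python
import numpy as np

def channel_logits(features):
    """Global average pooling per channel; a stand-in for the SEAM-style
    module's pooling plus fully connected output."""
    return features.mean(axis=(0, 1))  # shape (C,)

def apply_occlusion_attention(features, module_logits):
    """Multiply the original (H, W, C) features by attention weights
    derived from the module's output; exp keeps the weights positive
    so channels are only amplified or attenuated, never sign-flipped."""
    weights = np.exp(module_logits)    # one weight per channel
    return features * weights          # broadcast over the spatial dims
```

Because the weights multiply the original feature map rather than replacing it, channels carrying evidence of partially occluded cells can be amplified while the underlying features are preserved.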
2.3 | Blood cell detection and classification method
2.3.1 | Dataset description
Both the training set and the test set used in this experiment are crafted from the BCCD, which is a limited-size public dataset of human blood cell images. BCCD contains 364 images of different types of blood cells from peripheral blood. All blood cell training set images contained a total of 9776 cell labels: 8310 red cell labels, 744 white cell labels, and 722 platelet labels. The BCCD contains a small number of white blood cells and platelets but a large number of red blood cells. It is difficult to detect these cells due to overlapping cells, adhesion of cells, broken red blood cells, and small target platelets. Since our algorithm uses Mosaic
FIGURE 5 Comparison of bidirectional feature pyramid network (BiFPN) and path aggregation network (PAN) network structures. (a) Network structure of BiFPN: the downward arrow denotes a top-down path, which transmits the semantic information of high-level features; the upward arrow denotes a bottom-up path that transmits the location information of low-level features; the curved arrow denotes a cross-scale connection. By adding a jump connection and a two-way path, weighted fusion and a two-way cross-scale connection are realised. (b) PAN network structure, which utilises top-down and bottom-up feature transfer.
data enhancement at the input end, we do not carry out data amplification in the data preprocessing stage.
2.3.2 | Model use
Figure 7 presents a flowchart of the proposed method, which consists of several key steps. These steps outline the process from data preparation to model evaluation and are designed to optimise the detection and classification of blood cells. A detailed description of each step is provided below.

Step 1: In this initial step, the images are loaded into a dataset, which serves as the foundation for subsequent processing. The loaded images are separated into two sets: a training set and a test set.
Step 2: We set up the YOLOv8 architecture to form the basis of our proposed framework.
Step 3: Several modifications are made to the standard YOLOv8 to enhance performance for blood cell detection. Key changes are made in the following sub-steps.
Step 3.1: Several standard convolutional layers within the YOLOv8 backbone are replaced with SPD-Conv layers.
Step 3.2: The original neck network of YOLOv8 is substituted with a BiFPN.
Step 3.3: We add SEAM and MultiSEAM to the neck network.
Step 4: Here, the parameters of the NBCDC-YOLOv8 model are fine-tuned through comprehensive training. Key parameters, including learning rate, batch size, and optimiser, are optimised to achieve the best performance. Details of the parameter settings used for tuning are discussed in section 3.2.
Step 5: The NBCDC-YOLOv8 model is trained using the training dataset, and its performance is evaluated using metrics such as average precision (AP), mAP, and the precision–recall curve. The efficiency of the proposed framework is assessed in terms of the number of model parameters. Detailed evaluations are presented in sections 4.2 and 4.3.
Step 6: Lastly, we present the visual results obtained from the proposed method. Detailed descriptions are provided in section 4.6.
3 | EXPERIMENTAL SETUP
3.1 | Experimental environment
Our experiments were conducted in a controlled hardware and software environment. Table 1 provides a detailed breakdown of the hardware and software used in the experiments.
3.2 | Parameter setting
The key parameters used during model training and optimisation are listed in Table 2. These parameters were selected and fine-tuned to achieve optimal performance for blood cell detection and classification.
3.3 | Evaluation metrics
To evaluate the performance of the NBCDC-YOLOv8 model, several key metrics were used: precision, recall, AP, mAP, and frames per second (FPS).

In the blood cell detection task, precision is the ratio of correctly predicted positive cells (TP) to the total number of samples classified as positive (TP + FP). Precision is calculated as shown in equation (3).

P (Precision) = TP / (TP + FP)    (3)
Recall is the proportion of the blood cells that should be detected (TP + FN) that are correctly predicted as positive (TP), i.e. the true category of the blood cell is positive and the final predicted result is also positive. Recall is calculated as shown in equation (4).
FIGURE 6 MultiSEAM structure. Left: architecture of MultiSEAM. Right: architecture of MultiSEAM's CSMM. FC stands for full connection. Through CSMM, depthwise convolution is performed. Then, after average pooling, the output is connected to a fully connected layer. The fully connected layer reassembles all local features extracted from the convolutional and pooling layers into a complete graph through a weight matrix. CSMM, channel and spatial mixing module.
R (Recall) = TP / (TP + FN)    (4)
We calculate the AP of a single-category model. Generally, a higher AP indicates a more effective classifier. Average precision is expressed as in equation (5).

AP = ∫_0^1 P(R) dR    (5)
The mAP measures our algorithm's performance in predicting the location and category of cells. Mean average precision represents the average of multiple AP values across different categories. The mAP value should fall within the range of 0–1, and a higher value indicates better performance. Mean average precision is calculated as in equation (6). Here, n represents the total number of categories.

mAP = (1/n) Σ_{i=1}^{n} AP_i    (6)
Frames per second serves as a metric to evaluate the speed of image processing or model inference. It is expressed as in equation (7).

FPS = 1 / (processing time per frame)    (7)
Note that in the above formulae, TP stands for the number of true positives, FP stands for the number of false positives, FN stands for the number of false negatives, n is the number of categories, and AP_i is the average precision of category i.
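Equations (3)–(7) translate directly into code. In the sketch below, the AP integral of equation (5) is approximated by a simple sum over sorted recall steps; the paper does not state which integration scheme it uses, so this is one common convention, not the authors' exact evaluation code.

```python
def precision(tp, fp):
    """Equation (3): TP / (TP + FP)."""
    return tp / (tp + fp)

def recall(tp, fn):
    """Equation (4): TP / (TP + FN)."""
    return tp / (tp + fn)

def average_precision(precisions, recalls):
    """Equation (5), approximated: sum precision over increasing recall
    steps, i.e. a rectangle-rule estimate of the area under the PR curve."""
    ap, prev_r = 0.0, 0.0
    for p, r in sorted(zip(precisions, recalls), key=lambda pr: pr[1]):
        ap += p * (r - prev_r)
        prev_r = r
    return ap

def mean_average_precision(aps):
    """Equation (6): the mean of the per-category AP values."""
    return sum(aps) / len(aps)

def fps(processing_time_per_frame):
    """Equation (7): frames processed per second."""
    return 1.0 / processing_time_per_frame
```

As a sanity check, averaging the three per-class values reported in section 4.2 (0.988, 0.897, 0.955), assuming they are the per-class APs at IoU 0.5, gives roughly 0.947, matching the reported mAP@0.5.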
4 | EXPERIMENTAL RESULTS
4.1 | Confusion matrix analysis
Confusion matrices are foundational for evaluating the performance of classification models. They enable the calculation of various crucial performance indicators to quantify a model's efficacy in different categories. Figure 8 shows the confusion matrix for NBCDC-YOLOv8. The model performs exceptionally well in classifying WBCs and platelets: in both cases it
FIGURE 7 Flow chart of the proposed method.
TABLE 1 Experimental platform configuration overview.

Component                  Description
Operating system           Ubuntu 20.04
GPU                        NVIDIA RTX 4090
CPU                        Intel(R) Xeon(R) Platinum 8352V CPU @ 2.10 GHz
Memory                     90 GB
Accelerated environment    CUDA 12.2
Programming language       Python 3.8.10
Deep learning framework    PyTorch 2.0
TABLE 2 Key parameter configuration.

Parameter                  Value
Learning rate (initial)    0.01
Epochs                     100
Batch size                 48
Weight decay               0.0005
Classes                    3
Optimiser                  SGD
achieves an accuracy of 1.00. However, there is a small
misclassication rate where 1% of WBCs and 4% of platelets are
mistakenly classied as background. For RBC, the accuracy is
slightly lower at 0.92; here, 8% of RBCs are incorrectly classied
as background and 5% are misclassied as WBCs. In summary,
our model performs nearperfectly in classifying WBCs and
platelets, and we observe a slight drop in classication accuracy
when classifying RBCs as some are misclassied as background
or WBCs. These results highlight the model's robustness but also
point out areas for improvement in RBC classication.
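Per-class rates like those quoted above come from a column-normalised confusion matrix, where each true-class column is divided by its column sum; a small sketch with invented counts (not the matrix in Figure 8):

```python
# Rows are predicted classes, columns are true classes.
# Counts below are invented for illustration, not the paper's data.
classes = ["WBC", "RBC", "Platelets", "background"]
counts = [
    [99,  5,  0, 0],   # predicted WBC
    [ 0, 87,  0, 0],   # predicted RBC
    [ 0,  0, 96, 0],   # predicted Platelets
    [ 1,  8,  4, 0],   # predicted background
]

def column_normalise(m):
    # Divide each column by its sum: entry [i][j] becomes the fraction
    # of true class j that was predicted as class i.
    col_sums = [sum(row[j] for row in m) for j in range(len(m[0]))]
    return [[row[j] / col_sums[j] if col_sums[j] else 0.0
             for j in range(len(row))] for row in m]

norm = column_normalise(counts)
for i, cls in enumerate(classes[:3]):
    print(f"{cls}: correctly classified {norm[i][i]:.2f}")
```

With these toy counts the diagonal gives 0.99 for WBC, 0.87 for RBC, and 0.96 for platelets, and the bottom row gives the fraction of each class lost to background.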
4.2 | PR curve analysis
Figure 9 presents the precision–recall curves for each blood
cell type. The proposed model performs exceptionally well in
detecting WBCs: it has a precision of 0.988 and consistently
high recall, which together indicate minimal false positives. In
RBC detection, our model has a precision of 0.897. Precision
decreases at higher recall levels, and thus there is room for
improvement. Lastly, for platelet detection, our model shows
strong and stable performance with a precision of 0.955.
Overall, the model achieves an impressive mAP@0.5 of 0.947.
This result showcases its effectiveness, particularly in detecting
small, dense, and overlapping blood cells.
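Each AP behind these curves is the area under the corresponding precision–recall curve; a minimal all-point-interpolation sketch, with invented curve points rather than those in Figure 9:

```python
def average_precision(recalls, precisions):
    """Area under a PR curve using all-point interpolation.

    `recalls` must be sorted ascending. Precision is first made
    monotonically non-increasing from right to left, then the area
    is accumulated over the recall steps.
    """
    # Add sentinel points at recall 0 and 1
    r = [0.0] + list(recalls) + [1.0]
    p = [1.0] + list(precisions) + [0.0]
    # Enforce monotonically decreasing precision
    for i in range(len(p) - 2, -1, -1):
        p[i] = max(p[i], p[i + 1])
    # Sum rectangle areas between successive recall points
    return sum((r[i + 1] - r[i]) * p[i + 1] for i in range(len(r) - 1))

# Invented PR points, not the paper's curves
ap = average_precision([0.2, 0.5, 0.8, 0.95], [0.99, 0.97, 0.92, 0.80])
print(round(ap, 3))  # prints 0.885
```

mAP@0.5 is then simply the mean of these per-class areas, computed with matches counted at an IoU threshold of 0.5.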
4.3 | Comparative experiment
Here we compare the improved NBCDC-YOLOv8 model with several mainstream one-stage and two-stage target detection models, including Faster R-CNN, SSD, YOLOv3, YOLOv4, YOLOv5, and YOLOv8. To ensure a fair comparison, all experiments were conducted under the same conditions and with an intersection over union (IoU) threshold of 0.5. As demonstrated in Table 3, our proposed model achieves notable enhancements in both detection accuracy and mAP over the other models.
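Under this protocol, a prediction matches a ground-truth cell only when their intersection over union reaches 0.5; for axis-aligned (x1, y1, x2, y2) boxes the computation is the standard one:

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

# Two partially overlapping cells (coordinates invented)
print(iou((0, 0, 10, 10), (5, 0, 15, 10)))          # prints 0.3333333333333333
print(iou((0, 0, 10, 10), (5, 0, 15, 10)) >= 0.5)   # prints False: below the match threshold
```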
The NBCDC-YOLOv8 model achieves an mAP of 94.7%, which highlights its exceptional ability to accurately detect and classify various types of blood cells across all tested datasets. The overall detection rate also sees substantial improvements that make it highly effective in scenarios where precise classification is required.
Despite these advancements, there are slight trade-offs in recall (R) and speed (FPS) compared with some of the other models (e.g. YOLOv5s and YOLOv8n). The slightly lower recall of NBCDC-YOLOv8 is due to its design, which prioritises precision and is especially important for detecting small and complex objects such as platelets. Moreover, the integration of advanced modules such as BiFPN and SPD layers enhances feature extraction for small targets but makes the model more conservative, thereby leading to slightly lower recall. In terms of speed, NBCDC-YOLOv8 shows slower inference due to its more complex architecture: 466 layers versus YOLOv8's 225. The inclusion of MultiSEAM and refined feature extraction layers boosts accuracy but increases computational load, thus reducing FPS. While lightweight models such as YOLOv8n focus on speed, NBCDC-YOLOv8 sacrifices some speed for improved detection performance but still achieves a real-time FPS of 98.69.

FIGURE 8 Confusion matrix of NBCDC-YOLOv8. The confusion matrix shows the true category and the category predicted by the classification model.

FIGURE 9 PR curve.
In conclusion, although NBCDC-YOLOv8 demonstrates slightly lower recall and speed than some lighter models, its ability to deliver higher precision and mAP makes it the most balanced and effective model for blood cell detection. The trade-off between speed and accuracy results from the added complexity aimed at improving detection performance, particularly for smaller or more challenging objects. As a result, NBCDC-YOLOv8 offers a comprehensive performance advantage while balancing accuracy and real-time capability.
4.4 | Ablation experiment
As shown in Table 4, the NBCDC-YOLOv8 model demonstrates outstanding performance in terms of precision (88.0%) and mAP (94.7%). However, its recall rate (92.5%) is slightly lower than that of some other configurations; for example, YOLOv8 + SPD and YOLOv8 + MultiSEAM achieve recall rates of 96.8% and 95.0%, respectively. The decrease in recall occurs because NBCDC-YOLOv8 integrates multiple advanced components (including a BiFPN, MultiSEAM, SPD, and four detection heads), which significantly improve feature extraction and object localisation. These enhancements are designed to increase the model's precision, especially when detecting smaller and more challenging objects such as platelets. However, such a focus on precision can sometimes lead to more conservative detection behaviour, where the model prioritises avoiding false positives and thus potentially misses some true positives in the process. This trade-off between precision and recall is common in highly accurate models, where the purpose is to minimise misclassification at the expense of slightly reduced recall.

TABLE 3 Performance comparison of NBCDC-YOLOv8 and other models.

Model | P (%) | R (%) | AP per class (%) | Speed (ms) | FPS | mAP (%)
Faster R-CNN | 70.11 | 88.99 | RBC 83.44, WBC 97.56, Platelets 72.14 | 16.71 | 45.54 | 84.38
SSD | 80.94 | 67.50 | RBC 76.33, WBC 97.33, Platelets 73.53 | 7.25 | 137.78 | 82.39
YOLOv3 | 84.33 | 24.85 | RBC 70.43, WBC 68.34, Platelets 12.78 | 10.80 | 92.57 | 50.52
YOLOv4 | 68.75 | 20.32 | RBC 45.39, WBC 31.81, Platelets 25.73 | 16.70 | 59.82 | 34.31
YOLOv5s | 86.50 | 90.50 | RBC 84.60, WBC 98.20, Platelets 90.20 | 11.70 | 85.47 | 91.00
YOLOv8n | 85.50 | 92.70 | RBC 86.60, WBC 98.70, Platelets 91.90 | 17.30 | 51.28 | 92.40
YOLOv8s | 87.10 | 92.70 | RBC 88.30, WBC 99.40, Platelets 94.90 | 18.10 | 55.25 | 94.20
YOLOv8m | 86.60 | 92.40 | RBC 87.40, WBC 99.10, Platelets 93.20 | 17.90 | 55.87 | 93.20
NBCDC-YOLOv8 (ours) | 88.00 | 92.50 | RBC 89.70, WBC 98.90, Platelets 95.50 | 10.10 | 98.69 | 94.70

Note: Bold values are used to emphasise key results of our model and highlight its significance.
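The BiFPN referenced in this ablation fuses multi-scale features with fast normalised weights, w_i / (sum_j w_j + eps), in the EfficientDet formulation [30]; a framework-free toy sketch of that weighting (all values invented):

```python
EPS = 1e-4  # small constant from the fast normalised fusion formulation

def fast_normalised_fusion(features, weights):
    """Weighted fusion of same-shaped feature values.

    Weights are kept non-negative (ReLU in the original) and normalised
    by their sum rather than a softmax, which is cheaper to compute.
    """
    w = [max(0.0, wi) for wi in weights]
    total = sum(w) + EPS
    return [sum(wi * f[k] for wi, f in zip(w, features)) / total
            for k in range(len(features[0]))]

# Two toy "feature maps" (flattened) from adjacent pyramid levels
p_small = [1.0, 2.0, 3.0]   # e.g. upsampled deeper level
p_large = [3.0, 2.0, 1.0]   # e.g. current level
print(fast_normalised_fusion([p_small, p_large], [0.7, 0.3]))
```

In the real network the weights are learned per fusion node and the fused maps then pass through a convolution; here they are fixed scalars purely to show the normalisation.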
In conclusion, while NBCDC-YOLOv8 achieves superior performance in terms of precision and mAP, the slightly lower recall is the result of the model's design. By focussing on maximising precision and improving small-object detection, the model sacrifices some recall in favour of reducing false positives. Despite this trade-off, the overall performance of NBCDC-YOLOv8 remains highly competitive and makes it the most balanced model for blood cell detection tasks, where both precision and real-time detection are critical.
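The SPD component ablated above replaces strided convolution with a space-to-depth rearrangement followed by a non-strided convolution [29]; a dependency-free sketch of the space-to-depth step on one 4 × 4 single-channel map:

```python
def space_to_depth(x, scale=2):
    """Rearrange an H x W single-channel map (nested lists) into
    scale**2 maps of size (H/scale) x (W/scale). Unlike strided
    convolution or pooling, no information is discarded."""
    h, w = len(x), len(x[0])
    assert h % scale == 0 and w % scale == 0
    out = []
    for dy in range(scale):
        for dx in range(scale):
            out.append([[x[i * scale + dy][j * scale + dx]
                         for j in range(w // scale)]
                        for i in range(h // scale)])
    return out

x = [[ 1,  2,  3,  4],
     [ 5,  6,  7,  8],
     [ 9, 10, 11, 12],
     [13, 14, 15, 16]]
maps = space_to_depth(x)
print(len(maps))   # prints 4: the channel dimension grows by scale**2
print(maps[0])     # prints [[1, 3], [9, 11]]
```

Every pixel of the input survives in one of the four quarter-resolution slices, which is why the design suits small, low-resolution targets such as platelets.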
4.5 | Model generalisability
To thoroughly evaluate the generalisability of the proposed NBCDC-YOLOv8 model, we tested it on multiple datasets with varying characteristics.

The BCCD is a widely used blood cell classification dataset and serves as the primary training set; it focuses on common blood cell types. Our model achieved its highest mAP (94.7%) on this dataset, which indicates strong performance under familiar conditions.
The CBC dataset contains blood cell images collected from a different medical institution, and it features variations in cell morphology and imaging conditions. When tested on the CBC dataset, the model maintained a high precision (90.2%) and mAP (93.8%). Although there was a slight drop in recall and mAP compared with the BCCD, the model generalised well across datasets with moderate variations in conditions.
We generated a synthetic dataset by applying transformations to existing images to simulate real-world challenges common in blood microscopy, including added noise, blurring, occlusion, and overlapping effects. On this dataset, our model achieved a precision of 86.0% but showed significant drops in recall (56.4%) and mAP (73.0%). This suggests that the model struggles in highly complex and noisy environments and misses a large proportion of true positives; further optimisation is therefore needed for such conditions.
The Leukaemia Dataset is a specialised dataset containing images of blood cells affected by leukaemia, and it tests a model's ability to detect abnormal cell types. On this dataset, our model's precision decreased to 82.0% and its mAP to 70.3%. We also observed a significant drop in recall (56.4%). This lower performance suggests that the model requires further improvement in detecting rare or abnormal cell types.
Table 5 summarises the cross-dataset performance of NBCDC-YOLOv8. The model's reduced performance on the CBC and synthetic datasets indicates that, while it generalises reasonably well across datasets with different characteristics, further fine-tuning is needed to handle more complex or abnormal conditions effectively. This highlights the model's strengths in detecting common blood cell types and its potential for improvement in more challenging scenarios.
4.6 | Qualitative analysis of blood cell detection
Figure 10 qualitatively analyses the performance of NBCDC-YOLOv8 when faced with three key challenges: uneven distribution, high density, and mutual occlusion of blood cells. For uneven distribution, as shown in Figure 10a, where white blood cells are isolated and red blood cells are sparsely distributed, YOLOv8 (Figure 10b) struggles with missed and false detections; this is particularly the case in the lower sparse regions. In contrast, NBCDC-YOLOv8 (Figure 10c) demonstrates more accurate detection due to its multi-scale feature fusion capability.
TABLE 4 Ablation experiment.

Model | Precision | Recall | mAP@0.5
YOLOv8 | 81.8 | 94.2 | 91.6
YOLOv8 + SPD | 82.2 | 96.8 | 94.0
YOLOv8 + MultiSEAM | 84.6 | 95.0 | 94.1
YOLOv8 + DetectHead ×4 | 87.5 | 91.5 | 93.7
YOLOv8 + BiFPN | 86.9 | 93.3 | 94.4
YOLOv8 + BiFPN + MultiSEAM | 86.4 | 90.7 | 92.1
YOLOv8 + BiFPN + MultiSEAM + SPD | 84.6 | 94.1 | 93.7
YOLOv8 + BiFPN + MultiSEAM + SPD + DetectHead ×4 (NBCDC-YOLOv8) | 88.0 | 92.5 | 94.7

Note: Bold values are used to emphasise key results of our model and highlight its significance.

TABLE 5 Comparison of results across different datasets.

Dataset | Precision | Recall | mAP@0.5
BCCD | 88.0 | 92.5 | 94.7
CBC | 90.2 | 88.6 | 93.8
Synthetic | 80.7 | 52.3 | 55.0
Leukaemia | 56.4 | 88.0 | 70.3
FIGURE 10 Comparison of the NBCDC-YOLOv8 network's improvement on false detection and missed detection of adherent red blood cells. (a) Original image, (b) image produced by YOLOv8, and (c) image produced by NBCDC-YOLOv8.

FIGURE 11 Loss curve: the abscissa represents the number of iterations, while the ordinate represents the loss value. This demonstrates the variation of the loss function during model training.
For high-density regions, particularly in the lower right corner of Figure 10b, YOLOv8 exhibits inaccurate bounding boxes and false positives when detecting closely packed red blood cells. However, NBCDC-YOLOv8 (Figure 10c) effectively distinguishes adjacent cells with greater precision, thereby minimising false positives. Lastly, in the case of mutual occlusion, YOLOv8 shows incomplete bounding boxes and missed detections. NBCDC-YOLOv8 significantly improves detection accuracy by correctly identifying overlapping cells and resolving occlusions. These qualitative results illustrate NBCDC-YOLOv8's superior performance in addressing the challenges posed by uneven distribution, high density, and mutual occlusion in blood cell detection.
5 | DISCUSSION
To verify the convergence of the proposed algorithm, we present the three blood cell loss curves of the improved NBCDC-YOLOv8 in Figure 11. The loss value decreases gradually and stabilises when the number of iterations reaches approximately 50. This suggests that the algorithm converges rapidly, which indicates high training efficiency.
Table 6 compares the model and accuracy from the present study with those from previous studies. The proposed NBCDC-YOLOv8 model notably outperforms previous cell detection techniques with its mAP of 94.70%.
The NBCDC-YOLOv8 model offers significant improvements in precision and mAP while maintaining real-time detection capability. This makes it especially useful for detecting smaller objects such as platelets. However, it demonstrates a slight reduction in recall due to its focus on minimising false positives, and its increased computational complexity may restrict its use in resource-poor environments. Future research could focus on optimising the model to improve recall without sacrificing precision and on simplifying the architecture to reduce computational demands. Such changes would make the model more suitable for a broader range of practical applications.
6 | CONCLUSIONS
This paper presents the NBCDC-YOLOv8 algorithm for real-time, efficient blood cell detection. It addresses challenges such as variable blood cell scale, limited data, dense occlusion, and low accuracy caused by erythrocyte adhesion. To compensate for limited data, we apply Mosaic data augmentation, splicing four images together to enrich data characteristics. We then introduce SPD-Conv to process low-resolution blood microscope images effectively. To handle scale variations and adhesion, we employ a BiFPN-inspired feature fusion structure that enhances multi-scale feature fusion and improves the feature expression of red blood cells. Next, we introduce the MultiSEAM module to accurately localise densely packed cells, thereby improving detection accuracy. For small-target detection, four detection heads are added to increase platelet recognition. The algorithm achieved an AP of 88.8% for red blood cells, 98.6% for white blood cells, and 89.5% for platelets. Moreover, its mAP on the BCCD was 94.7%, which surpassed that of previous methods. When the model was further tested on the CBC dataset, the mAP reached 93.8%, indicating good generalisability and robustness. Future work will explore diverse cell data for clinical applications and broader use in medical image analysis.
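The Mosaic step recapped above tiles four source images onto one canvas and shifts each image's boxes by its quadrant offset; a coordinate-only sketch under that assumed layout (the 640 × 640 canvas size is illustrative, not from the paper):

```python
def mosaic_offsets(canvas_w, canvas_h):
    """Top-left offsets of the four quadrants of the mosaic canvas."""
    cx, cy = canvas_w // 2, canvas_h // 2
    return [(0, 0), (cx, 0), (0, cy), (cx, cy)]

def shift_box(box, offset):
    """Translate an (x1, y1, x2, y2) box into mosaic coordinates."""
    ox, oy = offset
    x1, y1, x2, y2 = box
    return (x1 + ox, y1 + oy, x2 + ox, y2 + oy)

offsets = mosaic_offsets(640, 640)
# A platelet box from the image placed in the bottom-right quadrant
print(shift_box((10, 20, 50, 60), offsets[3]))  # prints (330, 340, 370, 380)
```

Real Mosaic implementations also rescale each source image and jitter the centre point; only the label-remapping arithmetic is shown here.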
AUTHOR CONTRIBUTIONS
Xuan Chen: Conceptualization; Data curation; Formal analysis; Methodology; Software; Writing—original draft; Writing—review & editing. Linxuan Li: Conceptualization; Data curation; Formal analysis; Visualization. Xiaoyu Liu: Investigation; Validation. Fengjuan Yin: Supervision; Validation; Visualization; Writing—review & editing. Xue Liu: Project administration; Supervision. Xiaoxiao Zhu: Supervision; Validation. Yufeng Wang: Funding acquisition; Resources; Supervision. Fanbin Meng: Conceptualization; Resources; Supervision.
ACKNOWLEDGEMENTS
The authors are grateful for the support and guidance from
Jining Medical University. This work was supported by the Jining
Medical University Classroom Teaching Reform Research
Project (Grant No. 2022KT012) and the Innovation and
Entrepreneurship Training Programme for College Students
(Grant Nos. 202210443002, 202210443003, S202310443006,
cx2023094z and cx2022044z).
CONFLICT OF INTEREST STATEMENT
The authors declare no conicts of interest.
DATA AVAILABILITY STATEMENT
You can access our research code at the following GitHub repository: https://github.com/FanbinMengGroup/NBCDC.git.
ORCID
Xuan Chen: https://orcid.org/0009-0002-0129-4832
REFERENCES
1. Roland, L., Drillich, M., Iwersen, M.: Hematology as a diagnostic tool in bovine medicine. J. Vet. Diagn. Invest. 26(5), 592–598 (2014). https://doi.org/10.1177/1040638714546490
2. Atkins, C.G., et al.: Raman spectroscopy of blood and blood components. Appl. Spectrosc. 71(5), 767–793 (2017). https://doi.org/10.1177/0003702816686593
3. Hussein, S., et al.: Automatic segmentation and quantification of white and brown adipose tissues from PET/CT scans. IEEE Trans. Med. Imag. 36(3), 734–744 (2017). https://doi.org/10.1155/2014/979302
4. Chabot-Richards, D.S., George, T.I.: White blood cell counts: reference methodology. Clin. Lab. Med. 35(1), 11–24 (2015). https://doi.org/10.3389/fmed.2018.00084/full
5. Garraud, O., Tissot, J.D.: Blood and blood components: from similarities to differences. Front. Med. 5, 84 (2018). https://doi.org/10.3389/fmed.2018.00084
6. Borland, D., et al.: Segmentor: a tool for manual refinement of 3D microscopy annotations. BMC Bioinf. 22(1), 260 (2021). https://doi.org/10.1186/s12859-021-04202-8
7. Acharjee, S., et al.: A Semi-automated Approach Using GUI for the Detection of Red Blood Cells, pp. 525–529 (2016)
8. Asghar, R., Kumar, S., Mahfooz, A.: Classification of Blood Cells Using Deep Learning Models (2023)
9. Alam, M.M., Islam, M.T.: Machine learning approach of automatic identification and counting of blood cells. Healthc. Technol. Lett. 6(4), 103–108 (2019). https://doi.org/10.1049/htl.2018.5098
10. Khan, S., et al.: A review on traditional machine learning and deep learning models for WBCs classification in blood smear images. IEEE Access 9, 10657–10673 (2021). https://doi.org/10.1109/ACCESS.2020.3048172
TABLE 6 Comparison with state-of-the-art methods.

Model | Year | Dataset | mAP (%)
Faster R-CNN [19] | 2017 | BCCD | 76.50
CycleGAN [32] | 2020 | BCCD | 83.90
ISE-YOLO [33] | 2021 | BCCD | 85.70
Integrated attention mechanism and YOLOv5 [34] | 2024 | BCCD | 89.60
Fast and efficient YOLOv3 [35] | 2021 | BCCD | 89.86
CNN architecture based on YOLO [36] | 2022 | BCCD | 91.13
TE-YOLOF-B [37] | 2021 | BCCD | 91.90
A-YOLOv5 [38] | 2024 | BCCD | 93.30
NBCDC-YOLOv8 (ours) | | BCCD | 94.70

Note: Bold values are used to emphasise key results of our model and highlight its significance.
11. Das, P.K., et al.: A systematic review on recent advancements in deep and machine learning based detection and classification of acute lymphoblastic leukemia. IEEE Access 10, 81741–81763 (2022). https://doi.org/10.1109/ACCESS.2022.3196037
12. Mulya, R.F., Utami, E., Ariatmanto, D.: Classification of acute lymphoblastic leukemia based on white blood cell images using InceptionV3 model. J. RESTI (Rekayasa Sistem dan Teknologi Informasi) 7(4), 947–952 (2023). https://doi.org/10.29207/resti.v7i4.5182
13. Rahat, I., et al.: A step towards automated haematology: DL models for blood cell detection and classification. EAI Endorsed Trans. Pervasive Health Technol. 10 (2024). https://doi.org/10.4108/eetpht.10.5477
14. Zhang, D., Zhang, P., Wang, L.: Cell Counting Algorithm Based on YOLOv3 and Image Density Estimation, pp. 920–924 (2019)
15. Moallem, G., et al.: Detecting and Segmenting Overlapping Red Blood Cells in Microscopic Images of Thin Blood Smears (2018)
16. Shahzad, M., et al.: Blood cell image segmentation and classification: a systematic review. PeerJ Comput. Sci. 10, e1813 (2024). https://doi.org/10.7717/peerj-cs.1813
17. Cheng, W., et al.: Application of image recognition technology in pathological diagnosis of blood smears. Clin. Exp. Med. 24(1), 181 (2024). https://doi.org/10.1007/s10238-024-01379-z
18. Girshick, R., et al.: Rich feature hierarchies for accurate object detection and semantic segmentation. IEEE Comput. Soc., 580–587 (2014). https://doi.org/10.1109/CVPR.2014.81
19. Ren, S., et al.: Faster R-CNN: towards real-time object detection with region proposal networks. IEEE Trans. Pattern Anal. Mach. Intell. 39(6), 1137–1149 (2017). https://doi.org/10.1109/TPAMI.2016.2577031
20. Redmon, J., et al.: You only look once: unified, real-time object detection. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 779–788 (2016). https://doi.org/10.1109/CVPR.2016.91
21. Redmon, J., Farhadi, A.: YOLO9000: Better, Faster, Stronger, pp. 6517–6525 (2017)
22. Redmon, J., Farhadi, A.: YOLOv3: An Incremental Improvement (2018). arXiv:1804.02767. https://doi.org/10.48550/arXiv.1804.02767
23. Berg, A.C., et al.: SSD: Single Shot MultiBox Detector, vol. 9905, pp. 21–37 (2016). https://doi.org/10.1007/978-3-319-46448-0_2
24. Feng, C., et al.: TOOD: Task-Aligned One-Stage Object Detection, pp. 3490–3499 (2021). https://doi.org/10.1109/ICCV48922.2021.00349
25. Li, X., et al.: Generalized Focal Loss: Learning Qualified and Distributed Bounding Boxes for Dense Object Detection, pp. 21002–21012 (2020). arXiv:2006.04388. https://doi.org/10.48550/arXiv.2006.04388
26. Hu, B., et al.: A Preliminary Study on Data Augmentation of Deep Learning for Image Classification, pp. 117–122 (2019). https://doi.org/10.1145/3361242.3361259
27. Bochkovskiy, A., Wang, C.Y., Liao, H.Y.M.: YOLOv4: Optimal Speed and Accuracy of Object Detection (2020). arXiv:2004.10934. https://doi.org/10.48550/arXiv.2004.10934
28. Yun, S., et al.: CutMix: Regularization Strategy to Train Strong Classifiers with Localizable Features, pp. 6022–6031 (2019)
29. Sunkara, R., Luo, T.: No More Strided Convolutions or Pooling: A New CNN Building Block for Low-Resolution Images and Small Objects (2022)
30. Tan, M., Pang, R., Le, Q.V.: EfficientDet: Scalable and Efficient Object Detection, pp. 10778–10787 (2019). https://doi.org/10.48550/arXiv.1911.09070
31. Yu, Z., et al.: YOLO-FaceV2: A Scale and Occlusion Aware Face Detector (2022). arXiv:2208.02019. https://doi.org/10.48550/arXiv.2208.02019
32. He, J., et al.: CycleGAN with an improved loss function for cell detection using partly labeled images. IEEE J. Biomed. Health Inform. 24(9), 2473–2480 (2020). https://doi.org/10.1109/JBHI.2020.2970091
33. Liu, C., et al.: Improved squeeze-and-excitation attention module based YOLO for blood cells detection, pp. 3911–3916 (2021)
34. Shahin, O.R., et al.: Optimized automated blood cells analysis using Enhanced Grey Wolf Optimization with integrated attention mechanism and YOLOv5. Alex. Eng. J. 109, 58–70 (2024). https://doi.org/10.1016/j.aej.2024.08.054
35. Shakarami, A., et al.: A fast and yet efficient YOLOv3 for blood cell detection. Biomed. Signal Process. Control 66 (2021). https://doi.org/10.1016/j.bspc.2021.102495
36. Amudhan, A.N., et al.: RFSOD: a lightweight single-stage detector for real-time embedded applications to detect small-size objects. J. Real-Time Image Process. 19(1), 133–146 (2022). https://doi.org/10.1007/s11554-021-01170-3
37. Xu, F., et al.: TE-YOLOF: tiny and efficient YOLOF for blood cell detection. Biomed. Signal Process. Control 73, 103416 (2022). https://doi.org/10.1007/s11554-021-01170-3
38. Gu, W., Sun, K.: A-YOLOv5: improved YOLOv5 based on attention mechanism for blood cell detection. Biomed. Signal Process. Control 88, 105034 (2024). https://doi.org/10.1016/j.bspc.2023.105034
SUPPORTING INFORMATION
Additional supporting information can be found online in the Supporting Information section at the end of this article.

How to cite this article: Chen, X., et al.: NBCDC-YOLOv8: a new framework to improve blood cell detection and classification based on YOLOv8. IET Comput. Vis. e12341 (2025). https://doi.org/10.1049/cvi2.12341
14 of 14
-
CHEN ET AL.
ResearchGate has not been able to resolve any citations for this publication.
Article
Full-text available
Traditional manual blood smear diagnosis methods are time-consuming and prone to errors, often relying heavily on the experience of clinical laboratory analysts for accuracy. As breakthroughs in key technologies such as neural networks and deep learning continue to drive digital transformation in the medical field, image recognition technology is increasingly being leveraged to enhance existing medical processes. In recent years, advancements in computer technology have led to improved efficiency in the identification of blood cells in blood smears through the use of image recognition technology. This paper provides a comprehensive summary of the methods and steps involved in utilizing image recognition algorithms for diagnosing diseases in blood smears, with a focus on malaria and leukemia. Furthermore, it offers a forward-looking research direction for the development of a comprehensive blood cell pathological detection system.
Article
Full-text available
INTRODUCTION: Deep Learning has significantly impacted various domains, including medical imaging and diagnostics, by enabling accurate classification tasks. This research focuses on leveraging deep learning models to automate the classification of different blood cell types, thus advancing hematology practices.OBJECTIVES: The primary objective of this study is to evaluate the performance of five deep learning models - ResNet50, AlexNet, MobileNetV2, VGG16, and VGG19 - in accurately discerning and classifying distinct blood cell categories: Eosinophils, Lymphocytes, Monocytes, and Neutrophils. The study aims to identify the most effective model for automating hematology processes.METHODS: A comprehensive dataset containing approximately 8,500 augmented images of the four blood cell types is utilized for training and evaluation. The deep learning models undergo extensive training using this dataset. Performance assessment is conducted using various metrics including accuracy, precision, recall, and F1-score.RESULTS: The VGG19 model emerges as the top performer, achieving an impressive accuracy of 99% with near-perfect precision and recall across all cell types. This indicates its robustness and effectiveness in automated blood cell classification tasks. Other models, while demonstrating competence, do not match the performance levels attained by VGG19.CONCLUSION: This research underscores the potential of deep learning in automating and enhancing the accuracy of blood cell classification, thereby addressing the labor-intensive and error-prone nature of traditional methods in hematology. The superiority of the VGG19 model highlights its suitability for practical implementation in real-world scenarios. However, further investigation is warranted to comprehend model performance variations and ensure generalization to unseen data. 
Overall, this study serves as a crucial step towards broader applications of artificial intelligence in medical diagnostics, particularly in the realm of automated hematology, fostering advancements in healthcare technology.
Article
Full-text available
Background Blood diseases such as leukemia, anemia, lymphoma, and thalassemia are hematological disorders that relate to abnormalities in the morphology and concentration of blood elements, specifically white blood cells (WBC) and red blood cells (RBC). Accurate and efficient diagnosis of these conditions significantly depends on the expertise of hematologists and pathologists. To assist the pathologist in the diagnostic process, there has been growing interest in utilizing computer-aided diagnostic (CAD) techniques, particularly those using medical image processing and machine learning algorithms. Previous surveys in this domain have been narrowly focused, often only addressing specific areas like segmentation or classification but lacking a holistic view like segmentation, classification, feature extraction, dataset utilization, evaluation matrices, etc . Methodology This survey aims to provide a comprehensive and systematic review of existing literature and research work in the field of blood image analysis using deep learning techniques. It particularly focuses on medical image processing techniques and deep learning algorithms that excel in the morphological characterization of WBCs and RBCs. The review is structured to cover four main areas: segmentation techniques, classification methodologies, descriptive feature selection, evaluation parameters, and dataset selection for the analysis of WBCs and RBCs. Results Our analysis reveals several interesting trends and preferences among researchers. Regarding dataset selection, approximately 50% of research related to WBC segmentation and 60% for RBC segmentation opted for manually obtaining images rather than using a predefined dataset. When it comes to classification, 45% of the previous work on WBCs chose the ALL-IDB dataset, while a significant 73% of researchers focused on RBC classification decided to manually obtain images from medical institutions instead of utilizing predefined datasets. 
In terms of feature selection for classification, morphological features were the most popular, being chosen in 55% and 80% of studies related to WBC and RBC classification, respectively. Conclusion The diagnostic accuracy for blood-related diseases like leukemia, anemia, lymphoma, and thalassemia can be significantly enhanced through the effective use of CAD techniques, which have evolved considerably in recent years. This survey provides a broad and in-depth review of the techniques being employed, from image segmentation to classification, feature selection, utilization of evaluation matrices, and dataset selection. The inconsistency in dataset selection suggests a need for standardized, high-quality datasets to strengthen the diagnostic capabilities of these techniques further. Additionally, the popularity of morphological features indicates that future research could further explore and innovate in this direction.
Article
Full-text available
Acute Lymphoblastic Leukemia (ALL) is the most prevalent form of leukemia that occurs in children. Detection of ALL through white blood cell image analysis can assist in prognosis and appropriate treatment. In this study, the author proposes an approach for classifying ALL based on white blood cell images using a Convolutional Neural Network (CNN) model called InceptionV3. The dataset used in this research consists of white blood cell images collected from patients with ALL and healthy individuals. These images were obtained from The Cancer Imaging Archive (TCIA), which is a service for storing large-scale cancer medical images available to the public. During the evaluation phase, the author used training data evaluation metrics such as accuracy and loss to measure the model's performance. The research results show that the InceptionV3 model is capable of classifying white blood cell images with a high level of accuracy. This model achieves an average ALL recognition accuracy of 0.9896 with a loss of 0.031. The use of CNN models like InceptionV3 in medical image analysis has the potential to enhance the efficiency and accuracy of image-based disease diagnosis.
Chapter
Full-text available
Convolutional neural networks (CNNs) have made resounding success in many computer vision tasks such as image classification and object detection. However, their performance degrades rapidly on tougher tasks where images are of low resolution or objects are small. In this paper, we point out that this roots in a defective yet common design in existing CNN architectures, namely the use of strided convolution and/or pooling layers, which results in a loss of fine-grained information and learning of less effective feature representations. To this end, we propose a new CNN building block called SPD-Conv in place of each strided convolution layer and each pooling layer (thus eliminates them altogether). SPD-Conv is comprised of a space-to-depth (SPD) layer followed by a non-strided convolution (Conv) layer, and can be applied in most if not all CNN architectures. We explain this new design under two most representative computer vision tasks: object detection and image classification. We then create new CNN architectures by applying SPD-Conv to YOLOv5 and ResNet, and empirically show that our approach significantly outperforms state-of-the-art deep learning models, especially on tougher tasks with low-resolution images and small objects. We have open-sourced our code at https://github.com/LabSAINT/SPD-Conv.
Article
Full-text available
Automatic Leukemia or blood cancer detection is a challenging job and is very much required in healthcare centers. It has a significant role in early diagnosis and treatment planning. Leukemia is a hematological disorder that starts from the bone marrow and affects white blood cells (WBCs). Microscopic analysis of WBCs is a preferred approach for an early detection of Leukemia since it is cost-effective and less painful. Very few literature reviews have been done to demonstrate a comprehensive analysis of deep and machine learning-based Acute Lymphoblastic Leukemia (ALL) detection. This article presents a systematic review of the recent advancements in this knowledge domain. Here, various artificial intelligence-based ALL detection approaches are analyzed in a systematic manner with merits and demits. The review of these schemes is conducted in a structured manner. For this purpose, segmentation schemes are broadly categorized into signal and image processing-based techniques, conventional machine learning-based techniques, and deep learning-based techniques. Conventional machine learning-based ALL classification approaches are categorized into supervised and unsupervised machine learning is presented. In addition, deep learning-based classification methods are categorized into Convolutional Neural Network (CNN), Recurrent Neural Network (RNN), and the Autoencoder. Then, CNN-based classification schemes are further categorized into conventional CNN, transfer learning, and other advancements in CNN. A brief discussion of these schemes and their importance in ALL classification are also presented. Moreover, a critical analysis is performed to present a clear idea about the recent research in this field. Finally, various challenging issues and future scopes are discussed that may assist readers in formulating new research problems in this domain.
Preprint
In recent years, face detection algorithms based on deep learning have made great progress. These algorithms can be generally divided into two categories, i.e. two-stage detector like Faster R-CNN and one-stage detector like YOLO. Because of the better balance between accuracy and speed, one-stage detectors have been widely used in many applications. In this paper, we propose a real-time face detector based on the one-stage detector YOLOv5, named YOLO-FaceV2. We design a Receptive Field Enhancement module called RFE to enhance receptive field of small face, and use NWD Loss to make up for the sensitivity of IoU to the location deviation of tiny objects. For face occlusion, we present an attention module named SEAM and introduce Repulsion Loss to solve it. Moreover, we use a weight function Slide to solve the imbalance between easy and hard samples and use the information of the effective receptive field to design the anchor. The experimental results on WiderFace dataset show that our face detector outperforms YOLO and its variants can be find in all easy, medium and hard subsets. Source code in https://github.com/Krasjet-Yu/YOLO-FaceV2