Received: 23 May 2024 | Revised: 12 December 2024 | Accepted: 2 January 2025
IET Computer Vision
DOI: 10.1049/cvi2.12341
ORIGINAL RESEARCH
NBCDC-YOLOv8: A new framework to improve blood cell detection and classification based on YOLOv8
Xuan Chen¹ | Linxuan Li¹ | Xiaoyu Liu¹ | Fengjuan Yin¹ | Xue Liu² | Xiaoxiao Zhu³ | Yufeng Wang¹ | Fanbin Meng¹
¹School of Medical Information Engineering, Jining Medical University, Rizhao, China
²College of Basic Medicine, Jining Medical University, Jining, China
³Respiratory Medicine Department, Rizhao Traditional Chinese Medicine Hospital, Rizhao, China
Correspondence
Yufeng Wang and Fanbin Meng, School of Medical Information Engineering, Jining Medical University, Rizhao, China.
Email: wyf@mail.jnmc.edu.cn and drmeng@mail.jnmc.edu.cn
Funding information
Innovation and Entrepreneurship Training Program
for College Students, Grant/Award Numbers:
202210443002, 202210443003, S202310443006,
cx2022044z, cx2023094z; Jining Medical University
Classroom Teaching Reform Research Project,
Grant/Award Number: 2022KT012
Abstract
In recent years, computer technology has successfully permeated all areas of medicine and its management, and it now offers doctors an accurate and rapid means of diagnosis. Existing blood cell detection methods suffer from low accuracy, which is caused by the uneven distribution, high density, and mutual occlusion of different blood cell types in blood microscope images. To address this, this article introduces NBCDC-YOLOv8: a new framework to improve blood cell detection and classification based on YOLOv8. Our framework innovates on several fronts: it uses Mosaic data augmentation to enrich the dataset and add small targets, incorporates a space to depth convolution (SPD-Conv) tailored for cells that are small and have low resolution, and introduces the Multi-Separated and Enhancement Attention Module (MultiSEAM) to enhance feature map resolution. Additionally, it integrates a bidirectional feature pyramid network (BiFPN) for effective multi-scale feature fusion and includes four detection heads to improve recognition accuracy for various cell sizes, especially small target platelets. Evaluated on the Blood Cell Classification Dataset (BCCD), NBCDC-YOLOv8 obtains a mean average precision (mAP) of 94.7%, and thus surpasses the original YOLOv8n by 2.3%.
KEYWORDS
bidirectional feature pyramid network, cell detection, mosaic data augmentation, multi‐separated and
enhancement attention module, space to depth convolution
1 | INTRODUCTION
Blood cell detection and classification are pivotal in clinical diagnosis and crucial for the subsequent identification and treatment of a wide array of diseases. Cell counting [1] is an essential process for evaluating the number of various cell types in a patient's blood, and blood cell detection and classification are primarily used for this purpose. The complete blood count (CBC) is a standard blood test that assesses an individual's overall health and aids in diagnosing a wide range of conditions. This test specifically gauges the primary cell types found in the blood: red blood cells, white blood cells, and platelets [2]. The red blood cell count (RBC) reflects the capacity for oxygen transport, the white blood cell count (WBC) reflects the condition of the immune system, and the platelet count relates to coagulation abilities. Additionally, the CBC encompasses haemoglobin concentration, red blood cell volume distribution width (RDW), mean red blood cell volume (MCV), and other essential markers [3]. These components assist medical professionals in comprehending patients' blood profiles and potential health concerns [4, 5]. Traditional blood cell counting relies on manual microscopic examination [6], which poses challenges such as operational complexity and increased susceptibility to statistical and observational errors when dealing with large volumes of samples [7].
In response to these challenges, researchers have been
exploring the utilisation of deep learning (DL) technology in
medical image analysis and seeking new automated solutions
[8, 9]. In 2021, Siraj Khan et al. comprehensively examined the
use of traditional machine learning (TML) and DL in
distinguishing white blood cells from blood smear images [10]. They underscored the efficacy and precision of these technologies in clinical diagnosis, especially in detecting haematological diseases such as leukaemia. Similarly, in 2022, Pradeep Kumar Das et al. assessed the contributions of DL and ML in identifying acute lymphoblastic leukaemia (ALL) [11]. They illustrated the promise of convolutional neural networks (CNNs) in analysing intricate blood cell images and suggested methods to further enhance detection accuracy. Current methodologies for detecting and classifying blood cells can be divided into two principal approaches: TML and DL approaches. Traditional machine learning methods rely on handcrafted feature extraction techniques and algorithms such as SVM and k-NN. In contrast, DL approaches utilise models such as CNNs that automatically learn features from images, which results in significantly enhanced detection accuracy. Rizki Firdaus Mulya et al. used the InceptionV3 model to recognise white blood cell images [12] and proposed a new method for leukaemia classification detection, which significantly improved early diagnosis accuracy. Subsequently, in 2024, Irfan Sadiq Rahat et al. evaluated the performance of several DL models on blood smear images [13] in terms of the precision of automated detection and classification. Similarly, Dongdong Zhang et al. proposed a cell counting algorithm utilising YOLOv3 and image density estimation [14], thereby further enhancing the precision in detection and counting of red and white blood cells.
In blood smear images, the distribution of white blood cells is sparser than that of red blood cells, which makes them easier to count. Conversely, red blood cells pose challenges for detection and counting due to their dense arrangement and tendency to overlap and adhere to each other [15, 16]. Additionally, platelets are small targets in a complex background and hence are often more difficult to count. Wangxinjun Cheng et al. pointed out that traditional image processing methods face significant limitations when dealing with the complex environment of blood cells [17], whereas DL methods have significantly improved detection accuracy thanks to their automatic feature learning; this improvement is especially observed in the pathological detection of diseases such as leukaemia and malaria. Therefore, we improve YOLOv8 to develop a new blood cell detection and classification framework (NBCDC-YOLOv8). Specifically, we aim to enhance its performance in complex data scenarios to ensure model robustness and detection accuracy.
The primary contributions of this article are listed below.
● Development of NBCDC-YOLOv8: We develop a customised YOLOv8 framework tailored for blood cell detection that addresses key challenges, such as cell overlap, uneven distribution, and high density.
● Model Enhancements: The introduction of SPD-Conv, MultiSEAM, and a bidirectional feature pyramid network (BiFPN) improves the detection of small and overlapping blood cells, thereby enhancing overall accuracy.
● State-of-the-Art Performance: The proposed model achieves 94.7% mean average precision (mAP) on the BCCD dataset and thus surpasses existing methods in detection precision.
● Real-time Application: With fast inference speed, the proposed model is well suited for real-time blood cell analysis in medical diagnostics.
2 | PROPOSED METHOD
Currently, the mainstream algorithms for target detection fall into two categories. First, we have two-stage algorithms, which are represented by R-CNN [18] and Faster R-CNN [19]. The basic concept behind such approaches is to generate a set of sparse candidate boxes through either a heuristic method (e.g. a selective search) or a CNN network (e.g. an RPN). Subsequently, these candidate boxes are classified and refined. Second, we have one-stage methods, which are represented by You Only Look Once (YOLO) [20–22] and SSD [23]. The fundamental idea here is to uniformly and densely sample from different areas in an image. Various proportions and aspect ratios can be employed during the sampling process, and then a CNN can be utilised to extract features and directly perform classification and refinement. This one-step process results in faster detection than two-stage detection while maintaining competitive accuracy.

Given the need for fast and efficient blood cell detection and classification in clinical settings, the present study opts for a one-stage algorithm. Specifically, we choose YOLOv8 due to its ability to balance detection speed and accuracy, particularly in complex scenarios involving overlapping and dense cell structures.
2.1 | The network structure of YOLOv8
The YOLO series algorithms have garnered considerable attention due to their exceptional efficiency and accuracy. In 2023, Ultralytics unveiled their most recent iteration of YOLO: YOLOv8.

YOLOv8 enhances the connection between the backbone feature extraction network and the detection head to strengthen feature representation, especially when handling complex scenarios and small object detection. Additionally, YOLOv8 integrates multi-scale feature fusion technology by using an improved feature pyramid network (FPN). This allows the model to better combine features from different layers, thereby improving localisation accuracy and classification ability. In contrast, YOLOv5 relies on CSPDarknet53 as the backbone and uses PANet for feature fusion. While YOLOv5 performs well in terms of speed and accuracy, its multi-scale feature fusion is less efficient than YOLOv8's new architecture.

In summary, YOLOv8 has undergone significant optimisations in its network architecture that benefit feature extraction and multi-scale feature fusion. These improvements are especially crucial for small object detection, such as platelet detection tasks, where YOLOv8 can identify and differentiate small and complex platelets more effectively than its
predecessors, demonstrating significantly improved detection accuracy. The YOLOv8 model maintains high detection accuracy and speed, which makes it suitable for real-time applications such as medical observation. Additionally, it exhibits excellent robustness and scalability, which is why we adopt it as the foundational model for further improvements.
Specifically, as shown in Figure 1, the input terminal preprocesses the input images through data enhancement, adaptive image scaling, and grayscale filling. Within the backbone network, image features are extracted using convolution and pooling techniques implemented through convolutional layer (Conv), cross stage partial bottleneck with two convolutions (C2f), and spatial pyramid pooling fusion (SPPF) structures. The neck terminal is constructed based on a path aggregation network (PAN), which merges feature maps with varying scaling scales through up-sampling, down-sampling, and feature stitching. The output utilises a decoupled head structure to separate the classification and regression processes. This includes positive and negative sample matching, as well as loss calculation. The YOLOv8 network employs the Task-aligned Assigner method [24] to assign weights to the classification score and regression score for positive sample matching. The loss calculation encompasses both classification loss and regression loss: classification loss is computed using binary cross-entropy, while regression loss is calculated using the distribution focal loss (DFL) [25] and complete intersection over union (CIoU) loss functions.
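To make the regression loss concrete, the sketch below shows a minimal reading of DFL following the formulation in [25]: the box offset is learnt as a discrete distribution over bins whose expectation matches the continuous target. This is an illustrative sketch, not the exact Ultralytics code, and the tensor shapes are assumptions.

```python
import torch
import torch.nn.functional as F

def distribution_focal_loss(pred_logits: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
    """Sketch of DFL [25].

    pred_logits: (N, reg_max + 1) unnormalised bin logits for one box side
    target:      (N,) continuous regression targets in [0, reg_max]
    """
    tl = target.floor().long()                          # left neighbouring bin
    tr = (tl + 1).clamp(max=pred_logits.size(1) - 1)    # right neighbouring bin
    wl = (tl + 1).float() - target                      # linear weights chosen so the
    wr = target - tl.float()                            # expected bin equals the target
    logp = F.log_softmax(pred_logits, dim=-1)
    # Cross-entropy against the two neighbouring bins, linearly weighted.
    loss = -(wl * logp.gather(1, tl.unsqueeze(1)).squeeze(1)
             + wr * logp.gather(1, tr.unsqueeze(1)).squeeze(1))
    return loss.mean()
```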
2.2 | Improved NBCDC-YOLOv8
We select YOLOv8n as the baseline for our research due to its low parameter count and fast detection speed. Figure 2 shows NBCDC-YOLOv8's architecture featuring the new SPD-Conv block in its backbone that replaces traditional convolution and pooling layers. Additionally, it incorporates BiFPN at the neck, thereby enhancing the traditional FPN by enabling bi-directional information flow. NBCDC-YOLOv8 also uses MultiSEAM to boost recognition accuracy for small target platelets.
2.2.1 | Mosaic data augmentation
In the detection and classification of blood cells, there are problems such as limited datasets and imbalanced categories, which lead to poor generalisability, low accuracy, and a model that is prone to overfitting. We can enhance the diversity of samples and increase background complexity to improve the generalisability of DL models through various data augmentation techniques [26]. In the case of the YOLOv8 network, when dealing with datasets with insufficient samples, image enhancement processing is applied during the model training stage. The primary data augmentation method utilised by YOLOv8 is Mosaic data augmentation [27]. The main idea is to concatenate four blood cell images onto one image as a training sample. Consequently, YOLOv8 can seamlessly process a new image consisting of four blood cell images as a batch input during training. These operations help alleviate the overfitting issues caused by the limited number of samples, decrease GPU consumption, and enhance the robustness and accuracy of the network.

Mosaic was initially introduced in YOLOv4 and is an extension of the CutMix data augmentation algorithm [28]. Since colour is central to blood cell detection and recognition, the original Mosaic data enhancement method is deemed inappropriate, so we remove the gamut conversion module from Mosaic. Moreover, to expand the sample pool and improve the model's generalisability and robustness, we assemble the four images after flipping and zooming them. Figure 3 illustrates the comparison between the original Mosaic data enhancement approach and the enhanced one. Mosaic data augmentation effectively addresses the challenges stemming from limited dataset samples and uneven distribution of samples.
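To make the assembly step concrete, the following minimal NumPy sketch stitches four flipped and zoomed images onto one canvas without any colour-space conversion, mirroring our removal of the gamut module. Function and parameter names are ours, and bounding-box remapping is omitted for brevity.

```python
import random
import numpy as np

def simple_mosaic(images, out_size=640):
    """Stitch four images into one training sample (sketch; label handling omitted)."""
    canvas = np.zeros((out_size, out_size, 3), dtype=np.uint8)
    # Random centre point where the four tiles meet
    cx = random.randint(out_size // 4, 3 * out_size // 4)
    cy = random.randint(out_size // 4, 3 * out_size // 4)
    regions = [(0, 0, cx, cy), (cx, 0, out_size, cy),
               (0, cy, cx, out_size), (cx, cy, out_size, out_size)]
    for img, (x1, y1, x2, y2) in zip(images, regions):
        if random.random() < 0.5:          # random horizontal flip
            img = img[:, ::-1]
        h, w = y2 - y1, x2 - x1
        # Nearest-neighbour resize ("zoom") of the source image to fit its tile
        ys = np.linspace(0, img.shape[0] - 1, h).astype(int)
        xs = np.linspace(0, img.shape[1] - 1, w).astype(int)
        canvas[y1:y2, x1:x2] = img[ys][:, xs]
    return canvas
```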
2.2.2 | SPD-Conv
In tasks demanding exceptional precision and sensitivity to structural nuances, the performance of standard CNNs may fall short. In contrast, SPD-Conv [29] excels in extracting pertinent features more efficiently. In the context of blood cell detection, SPD-Conv emerged as a viable alternative to traditional convolution and pooling layers while maintaining a similar parameter count. Its primary objective is to enhance the detection accuracy of low-resolution blood cell images and minuscule platelet objects. By leveraging its proficient matrix processing capabilities, SPD-Conv holds promise in facilitating disease diagnosis and progress monitoring. This innovative approach comprises two integral components: a space-to-depth layer and a non-strided convolution layer. The implementation of SPD-Conv within the YOLOv8 model, coupled with its validation on the BCCD dataset, underscores its efficacy in blood cell detection.
For a feature map X of any size (W, H, C) (where W = H), its sub-feature map sequence slice is shown in formula (1).

$$
\begin{aligned}
f_{0,0} &= X[0{:}W{:}\text{scale},\ 0{:}W{:}\text{scale}]\\
f_{1,0} &= X[1{:}W{:}\text{scale},\ 0{:}W{:}\text{scale}]\\
f_{0,1} &= X[0{:}W{:}\text{scale},\ 1{:}W{:}\text{scale}]\\
f_{1,1} &= X[1{:}W{:}\text{scale},\ 1{:}W{:}\text{scale}]
\end{aligned}
\tag{1}
$$

For a given feature map X of any size, sub-map $f_{x,y}$ consists of all entries X(i, j) for which i + x and j + y are divisible by scale. The operation of SPD-Conv is shown in Figure 4. First, the feature map input as W × H × C is sampled, and four feature maps with the size of W/2 × H/2 × C are obtained. Then, a feature map with a size of W/2 × H/2 × 4C is obtained by splicing. Next, the dimension is reduced by a 1 × 1 convolution, and the correlation between the feature maps is constructed to obtain a W/2 × H/2 × C feature map. It can be seen that SPD-Conv has higher information retention than traditional convolution, and therefore it can effectively improve feature extraction for small targets.
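The slicing in formula (1) followed by the 1 × 1 non-strided convolution translates directly into a few lines of PyTorch. The block below is a minimal sketch of this description; class and argument names are ours, not the authors' code.

```python
import torch
import torch.nn as nn

class SPDConv(nn.Module):
    """Space-to-depth followed by a non-strided convolution (sketch of SPD-Conv [29])."""
    def __init__(self, in_ch: int, out_ch: int, scale: int = 2):
        super().__init__()
        self.scale = scale
        # 1x1, stride-1 conv reduces the stacked channels back to out_ch
        self.conv = nn.Conv2d(in_ch * scale * scale, out_ch, kernel_size=1, stride=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        s = self.scale
        # Formula (1): slice the map into s*s sub-maps and stack them on channels
        parts = [x[..., i::s, j::s] for i in range(s) for j in range(s)]
        x = torch.cat(parts, dim=1)          # (N, C*s*s, H/s, W/s)
        return self.conv(x)                  # (N, out_ch, H/s, W/s)

# Example: SPDConv(64, 64)(torch.randn(1, 64, 160, 160)).shape -> (1, 64, 80, 80)
```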
FIGURE 1 YOLOv8 network structure illustrating four main modules: an input terminal, backbone network, neck terminal, and output terminal.
FIGURE 2 NBCDC-YOLOv8 network structure. Based on the YOLOv8 framework, the improved NBCDC-YOLOv8 algorithm framework integrates a bidirectional feature pyramid network (BiFPN) structure, uses SPD in the backbone, introduces MultiSEAM in the neck, and finally adds four detection heads to the output terminal.
2.2.3 | Bidirectional feature pyramid network fused into the middle of YOLOv8
The neck of YOLOv8 contains multi-scale feature fusion technology. Initially, the neck down-samples the input and then up-samples it. We replaced the C3 module with a C2f module, which combines the feature maps from various stages of the backbone to enhance representation. Specifically, YOLOv8's neck consists of an SPPF module, a PAA module, and two PAN modules. However, in the aforementioned multi-scale feature fusion, the feature information at each scale is inconsistent. Improved FPNs such as PANet and NAS-FPN introduce a significant amount of computation or fail to effectively fuse the features, thereby making it difficult to fully utilise the features across different scales. To overcome this limitation, our algorithm design is inspired by the efficient bidirectional cross-scale connection and the weighted feature fusion structure of BiFPN [30].

YOLOv8n has a PAN path aggregation structure, as shown in Figure 5a. Although PAN utilises both bottom-up and top-down feature transfer, it is only capable of merging features from two levels. In contrast, BiFPN employs a U-Net-like up-and-down sampling structure, which permits the rapid integration of features across various levels. Figure 5b shows the network structure of BiFPN. This structure not only enhances feature fusion through multi-level integration but also utilises the up-and-down sampling structure to reduce parameter calculation, and this results in efficient image detection across different scales.

It is evident that PAN has a two-way network structure that enables information transmission vertically between the top and bottom layers and horizontally within layers. However, this information transmission mode remains relatively simplistic and can only capture the characteristics of immediate neighbouring nodes. In contrast, the BiFPN network structure employed in this study can interact comprehensively with information from both horizontal and vertical information flow paths. Through the introduction of cross-layer jump connections, it further facilitates the exchange of information across multiple paths. This integration leads to more comprehensive feature map fusion, which then results in richer and more accurate feature expressions. By combining bidirectional scale connections and weighted features, this structure achieves a better balance between accuracy and efficiency.
Taking the P6 feature fusion depicted in Figure 5 as an illustration, we mathematically represent the structure of its weighted bidirectional pyramid network in formula (2).

$$P_6^{td} = \mathrm{Conv}\left(\frac{w_1 \cdot P_6^{in} + w_2 \cdot \mathrm{Resize}\!\left(P_7^{in}\right)}{w_1 + w_2 + \epsilon}\right)$$

$$P_6^{out} = \mathrm{Conv}\left(\frac{w_1' \cdot P_6^{in} + w_2' \cdot P_6^{td} + w_3' \cdot \mathrm{Resize}\!\left(P_5^{out}\right)}{w_1' + w_2' + w_3' + \epsilon}\right) \tag{2}$$

We calculate the intermediate feature map $P_6^{td}$ as follows. $P_6^{in}$ and $P_7^{in}$ are the input feature maps. The Resize function is used to change the spatial resolution of the feature maps so that maps from different levels can be combined, and $w_1$ and $w_2$ are the weight coefficients used to weight the input feature maps. $\epsilon$ is a small number that prevents the denominator from being zero and is often called a smoothing or stabilising term. The weighted and resized feature maps are summed and normalised by the total of the weights plus the smoothing term.
FIGURE 3 Comparison of data augmentation with the original Mosaic and our improved version: (a) the original Mosaic data expanded and adjusted in hue (H), saturation (S), and brightness (V) using the HSV model; (b) the improved Mosaic data augmentation with a modified colour gamut conversion module.
FIGURE 4 Operation of SPD-Conv. In SPD-Conv, the input feature map is first transformed through the SPD layer, followed by convolution through the Conv layer.
The Conv operator then applies a convolution to the normalised weighted feature map, which results in $P_6^{td}$.

Next, we calculate the output feature map $P_6^{out}$ by extending formula (2) with new weight coefficients $w_1'$, $w_2'$, and $w_3'$. In addition to $P_6^{in}$ and $P_6^{td}$ (from the first formula), the fusion now also includes $\mathrm{Resize}(P_5^{out})$, the output feature map of the previous scale adjusted by the resize function. As in the first step, the weighted feature maps are summed, divided by the sum of all the weights plus the smoothing term, and passed through a convolution to obtain $P_6^{out}$.
The BiFPN structure facilitates continuation of this process to achieve seamless transmission and fusion of information across various levels of the feature pyramid. This enables the generation of outputs that contain an extensive range of semantic information.
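Formula (2) amounts to a learnable, normalised weighted sum followed by a convolution. The sketch below shows this fast normalised fusion for k inputs that have already been resized to a common shape, in the spirit of BiFPN [30]; module and variable names are illustrative rather than the authors' code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class WeightedFusion(nn.Module):
    """Fast normalised fusion of k feature maps, as in formula (2) (sketch)."""
    def __init__(self, k: int, channels: int, eps: float = 1e-4):
        super().__init__()
        self.w = nn.Parameter(torch.ones(k))   # learnable fusion weights w_i
        self.eps = eps                         # smoothing term epsilon
        self.conv = nn.Conv2d(channels, channels, 3, padding=1)

    def forward(self, feats):
        # feats: list of k tensors, each already resized to (N, C, H, W)
        w = F.relu(self.w)                     # keep the weights non-negative
        w = w / (w.sum() + self.eps)           # normalise by sum of weights + eps
        fused = sum(wi * f for wi, f in zip(w, feats))
        return self.conv(fused)                # Conv of the normalised weighted sum

# e.g. P6_td = WeightedFusion(2, 256)([p6_in, F.interpolate(p7_in, scale_factor=2)])
```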
2.2.4 | Occlusion-aware attention network: MultiSEAM
nicant advancement in the area of blood cell detection
because it addressed the complexities introduced by occlusions
among blood cell types. Occlusions can severely impact
detection accuracy by affecting the non‐maximum suppression
threshold, which in turn may lead to the non‐detection of cells;
this is especially common when cells of the same type occlude
each other and thus obscure key features and compromise cell
localisation.
MultiSEAM is a cutting‐edge neural network module
tailored for rening feature recognition and enhancement in
intricate image analysis. It is an evolution of the conventional
Separation and Enhancement Attention Module (SEAM) [31]
and is particularly adept at handling intricate image datasets.
The groundbreaking aspect of MultiSEAM lies in its capacity
to simultaneously handle and integrate image features spanning
diverse dimensions ranging from nuanced local details to
sweeping macro structures.
In the context of blood cell analysis, MultiSEAM excels at
precisely segmenting a wide array of cell types through the
seamless fusion of multi‐scale features. This is achieved using
adaptive average pooling and a fully connected layer. Figure 6
presents the architecture of MultiSEAM. Here, we can see that
the channel and spatial mixing module employs patch embed-
ding to divide a large word vector matrix into smaller matrices
for processing. A GELU activation function and depthwise
convolution are used for the convolution operation. This is
followed by a point‐to‐point convolution to enhance informa-
tion transmission between channels and improve network con-
nectivity. To improve model effectiveness in dealing with blood
cell occlusion, the output of the MultiSEAM module is multi-
plied by the original feature as the attention weight. This step
aims to compensate for potential information loss in occluded
scenes by learning the relationship between occluded and non‐
occluded blood cells.
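As a rough illustration of the mechanism just described (not the authors' exact module), a SEAM-style branch can be sketched as a depthwise convolution with GELU, a pointwise channel-mixing convolution, and pooled fully connected weights that multiply the original feature map:

```python
import torch
import torch.nn as nn

class SEAMBlock(nn.Module):
    """Loose sketch of one SEAM-style attention branch (names are ours)."""
    def __init__(self, ch: int):
        super().__init__()
        self.spatial = nn.Sequential(
            nn.Conv2d(ch, ch, kernel_size=3, padding=1, groups=ch),  # depthwise conv
            nn.GELU(),
            nn.BatchNorm2d(ch),
        )
        self.channel = nn.Conv2d(ch, ch, kernel_size=1)  # point-to-point conv
        self.fc = nn.Sequential(                         # average pooling + FC layer
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
            nn.Linear(ch, ch),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        y = self.channel(self.spatial(x))
        w = torch.exp(self.fc(y)).unsqueeze(-1).unsqueeze(-1)  # per-channel weights
        return x * w    # the attention weight multiplies the original feature

# A MultiSEAM-style module would run such branches at several receptive-field
# sizes and fuse their outputs before the detection head.
```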
2.3 | Blood cell detection and classification method
2.3.1 | Dataset description
Both the training set and the test set used in this experiment are crafted from the BCCD, which is a limited-size public dataset of human blood cell images. BCCD contains 364 images of different types of blood cells from peripheral blood. All blood cell training set images contained a total of 9776 cell labels: 8310 red cell labels, 744 white cell labels, and 722 platelet labels. The BCCD contains a small number of white blood cells and platelets but a large number of red blood cells. It is difficult to detect these cells due to overlapping cells, adhesion of cells, broken red blood cells, and small target platelets. Since our algorithm uses Mosaic
FIGURE 5 Comparison of bidirectional feature pyramid network (BiFPN) and path aggregation network (PAN) network structures. (a) Network structure
of BiFPN: the downward arrow denotes a top‐down path, which transmits the semantic information of high‐level features; the upward arrow denotes a bottom‐
up path that transmits the location information of low‐level features; the curved arrow denotes a cross‐scale connection. By adding a jump connection and two‐
way path, the weighted fusion and two‐way cross‐scale connection are realised. (b) PAN network structure, which utilises top‐down and bottom‐up feature
transfer.
data enhancement at the input end, we do not carry out data amplification in the data pre-processing stage.
2.3.2 | Model use
Figure 7 presents a flowchart of the proposed method, which consists of several key steps. These steps outline the process from data preparation to model evaluation and are designed to optimise the detection and classification of blood cells. A detailed description of each step is provided below.

● Step 1: In this initial step, the images are loaded into a dataset, which serves as the foundation for subsequent processing. The loaded images are separated into two sets: a training set and a test set.
● Step 2: We set up the YOLOv8 architecture to form the basis of our proposed framework.
● Step 3: Several modifications are made to the standard YOLOv8 to enhance performance for blood cell detection. Key changes are made in the following sub-steps.
● Step 3.1: Several standard convolutional layers within the YOLOv8 backbone are replaced with SPD-Conv layers.
● Step 3.2: The original neck network of YOLOv8 is substituted with a BiFPN.
● Step 3.3: We add SEAM and MultiSEAM to the neck network.
● Step 4: Here, the parameters of the NBCDC-YOLOv8 model are fine-tuned through comprehensive training. Key parameters, including learning rate, batch size, and optimiser, are optimised to achieve the best performance. Details of the parameter settings used for tuning are discussed in section 3.2.
● Step 5: The NBCDC-YOLOv8 model is trained using the training dataset, and its performance is evaluated using metrics such as average precision (AP), mAP, and the precision-recall curve. The efficiency of the proposed framework is assessed in terms of the number of model parameters. Detailed evaluations are presented in sections 4.2 and 4.3.
● Step 6: Lastly, we present the visual results obtained from the proposed method. Detailed descriptions are provided in section 4.6.
3 | EXPERIMENTAL SETUP
3.1 | Experimental environment
Our experiments were conducted in a controlled hardware and software environment. Table 1 provides a detailed breakdown of the hardware and software used in the experiments.
3.2 | Parameter setting
The key parameters used during model training and optimisation are listed in Table 2. These parameters were selected and fine-tuned to achieve optimal performance for blood cell detection and classification.
3.3 | Evaluation metrics
To evaluate the performance of the NBCDC-YOLOv8 model, several key metrics were used: precision, recall, AP, mAP, and frames per second (FPS).

In the blood cell detection task, precision is the proportion of samples classified as positive that are truly positive, that is, the ratio of correctly predicted cells (TP) to all positive predictions (TP + FP). Precision is calculated as shown in equation (3).

$$P(\text{Precision}) = \frac{TP}{TP + FP} \tag{3}$$

Recall is the proportion of truly positive blood cells that are correctly detected, that is, the ratio of TP to all cells that should have been detected (TP + FN). Recall is calculated as shown in equation (4).
FIGURE 6 MultiSEAM structure. Left: architecture of MultiSEAM. Right: architecture of MultiSEAM's CSMM (channel and spatial mixing module). FC stands for full connection. Through the CSMM, depthwise convolution is performed. Then, after average pooling, the output is connected to a fully connected layer. The fully connected layer reassembles all local features extracted from the convolutional and pooling layers into a complete graph through a weight matrix.
$$R(\text{Recall}) = \frac{TP}{TP + FN} \tag{4}$$

We calculate the AP of a single-category model. Generally, a higher AP indicates a more effective classifier. Average precision is expressed as in equation (5).

$$AP = \int_0^1 P(R)\, dR \tag{5}$$

The mAP measures our algorithm's performance in predicting the location and category of cells. Mean average precision represents the average of the AP values across the different categories. The mAP value falls within the range of 0–1, and a higher value indicates better performance. Mean average precision is calculated as in equation (6), where n represents the total number of categories.

$$mAP = \frac{1}{n}\sum_{i=1}^{n} AP_i \tag{6}$$

Frames per second serves as a metric to evaluate the speed of image processing or model inference. It is expressed as in equation (7).

$$FPS = \frac{1}{\text{Processing time per frame}} \tag{7}$$

Note that in the above formulae TP stands for the number of true positives, FP for the number of false positives, FN for the number of false negatives, n is the number of categories, and $AP_i$ is the average precision of category i.
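As a quick worked example of equations (3)–(7), the snippet below plugs in hypothetical detection counts together with the per-class AP values reported in Figure 9 and the 10.1 ms inference time from Table 3; the raw counts themselves are made up for illustration.

```python
# Hypothetical outcome counts for one class (illustrative only)
tp, fp, fn = 90, 12, 8

precision = tp / (tp + fp)            # equation (3) -> 0.882
recall = tp / (tp + fn)               # equation (4) -> 0.918

# Equations (5)-(6): AP is the area under one class's P-R curve;
# mAP averages the per-class APs (values below are from Figure 9).
ap = {"RBC": 0.897, "WBC": 0.988, "Platelets": 0.955}
map50 = sum(ap.values()) / len(ap)    # -> 0.947, the reported mAP@0.5

fps = 1.0 / 0.0101                    # equation (7): 10.1 ms/frame -> ~99 FPS
print(f"P={precision:.3f} R={recall:.3f} mAP@0.5={map50:.3f} FPS={fps:.1f}")
```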
4 | EXPERIMENTAL RESULTS
4.1 | Confusion matrix analysis
Confusion matrices are foundational for evaluating the performance of classification models. They enable the calculation of various crucial performance indicators to quantify a model's efficacy in different categories. Figure 8 shows the confusion matrix for NBCDC-YOLOv8.
FIGURE 7 Flow chart of the proposed method.
TABLE 1 Experimental platform configuration overview.

| Component | Description |
| --- | --- |
| Operating system | Ubuntu 20.04 |
| GPU | NVIDIA RTX 4090 |
| CPU | Intel(R) Xeon(R) Platinum 8352V CPU @ 2.10 GHz |
| Memory | 90 GB |
| Accelerated environment | CUDA 12.2 |
| Programming language | Python 3.8.10 |
| Deep learning framework | PyTorch 2.0 |
TABLE 2 Key parameter configuration.

| Parameter | Value |
| --- | --- |
| Learning rate (initial) | 0.01 |
| Epochs | 100 |
| Batch size | 48 |
| Weight decay | 0.0005 |
| Class | 3 |
| Optimiser | SGD |
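For orientation, the Table 2 settings correspond to a standard Ultralytics YOLOv8 training call, sketched below. Here "bccd.yaml" is a hypothetical dataset configuration, and reproducing NBCDC-YOLOv8 itself would additionally require a custom model definition containing the SPD-Conv, BiFPN, and MultiSEAM modules rather than the stock yolov8n.yaml.

```python
from ultralytics import YOLO

# Baseline YOLOv8n trained with the Table 2 hyperparameters (sketch).
# "bccd.yaml" is a placeholder dataset config listing the train/val paths
# and the three classes (RBC, WBC, Platelets).
model = YOLO("yolov8n.yaml")
model.train(
    data="bccd.yaml",
    epochs=100,            # Table 2: epochs
    batch=48,              # Table 2: batch size
    lr0=0.01,              # Table 2: initial learning rate
    weight_decay=0.0005,   # Table 2: weight decay
    optimizer="SGD",       # Table 2: optimiser
)
```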
The model performs exceptionally well in classifying WBCs and platelets, achieving an accuracy of 1.00 in both cases. However, there is a small misclassification rate: 1% of WBCs and 4% of platelets are mistakenly classified as background. For RBCs, the accuracy is slightly lower at 0.92; here, 8% of RBCs are incorrectly classified as background and 5% are misclassified as WBCs. In summary, our model performs near-perfectly in classifying WBCs and platelets, and we observe a slight drop in classification accuracy when classifying RBCs, as some are misclassified as background or WBCs. These results highlight the model's robustness but also point out areas for improvement in RBC classification.
4.2 | PR curve analysis
Figure 9 presents the precision-recall curves for each blood cell type. The proposed model performs exceptionally well in detecting WBCs: it has a precision of 0.988 and consistently high recall, which together indicate minimal false positives. In RBC detection, our model has a precision of 0.897. Precision decreases at higher recall levels, and thus there is room for improvement. Lastly, for platelet detection, our model shows strong and stable performance with a precision of 0.955. Overall, the model achieves an impressive mAP@0.5 of 0.947. This result showcases its effectiveness, particularly in detecting small, dense, and overlapping blood cells.
4.3 | Comparative experiment
Here we compare the improved NBCDC-YOLOv8 model with several mainstream one-stage and two-stage target detection models, including Faster R-CNN, SSD, YOLOv3, YOLOv4, YOLOv5, and YOLOv8. To ensure a fair comparison, all experiments were conducted under the same conditions and using an intersection over union (IOU) threshold of 0.5. As demonstrated in Table 3, our proposed model achieves notable enhancements in both detection accuracy and mAP over the other models.

The NBCDC-YOLOv8 model achieves an mAP of up to 94.7%, which highlights its exceptional ability to accurately detect and classify various types of blood cells across all tested datasets. The overall detection rate also sees substantial improvements that make it highly effective in scenarios where precise classification is required.

Despite these advancements, there are slight trade-offs in terms of recall (R) and speed (FPS) when compared to some of the other models (e.g. YOLOv5s and YOLOv8n). The slightly lower recall in NBCDC-YOLOv8 is due to its design, which prioritises precision and is especially important for detecting small and complex objects such as platelets. Moreover, the integration of advanced modules such as BiFPN and SPD layers enhances feature extraction for small targets but makes the model more conservative, thereby leading to slightly lower recall.
FIGURE 8 Confusion matrix of NBCDC-YOLOv8. The confusion matrix shows the true category and the category predicted by the classification model.
FIGURE 9 PR curve.
In terms of speed, NBCDC-YOLOv8 shows slower inference due to its more complex architecture, which has 466 layers compared with YOLOv8's 225. The inclusion of MultiSEAM and refined feature extraction layers boosts accuracy but increases the computational load, thus reducing FPS. While lightweight models such as YOLOv8n focus on speed, NBCDC-YOLOv8 sacrifices some speed for improved detection performance but still achieves a real-time FPS of 98.69.

In conclusion, although NBCDC-YOLOv8 demonstrates slightly lower recall and speed than some lighter models, its ability to deliver higher precision and mAP makes it the most balanced and effective model for blood cell detection. The trade-off between speed and accuracy is a result of the added complexity aimed at improving detection performance, particularly for smaller or more challenging objects. As a result, NBCDC-YOLOv8 offers a comprehensive performance advantage while balancing accuracy and real-time capabilities.
4.4 | Ablation experiment
As shown in Table 4, the NBCDC-YOLOv8 model demonstrates outstanding performance in terms of precision (88.0%) and mAP (94.7%). However, its recall rate (92.5%) is slightly lower than that of some other configurations; for example, YOLOv8 + SPD and YOLOv8 + MultiSEAM achieve recall rates of 96.8% and 95.0%, respectively. The decrease in recall rate is because NBCDC-YOLOv8 integrates multiple advanced components (including a BiFPN, MultiSEAM, SPD, and four detection heads), which significantly improve feature extraction and object localisation. These enhancements are designed to increase the model's precision, especially in detecting smaller and more challenging objects such as platelets. However, such a focus on precision can sometimes lead to more conservative detection behaviour, where the model
TABLE 3 Performance comparison of NBCDC-YOLOv8 and other models.

| Model | P (%) | R (%) | Class | AP (%) | Speed (ms) | FPS | mAP (%) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Faster R-CNN | 70.11 | 88.99 | RBC | 83.44 | 16.71 | 45.54 | 84.38 |
| | | | WBC | 97.56 | | | |
| | | | Platelets | 72.14 | | | |
| SSD | 80.94 | 67.50 | RBC | 76.33 | 7.25 | 137.78 | 82.39 |
| | | | WBC | 97.33 | | | |
| | | | Platelets | 73.53 | | | |
| YOLOv3 | 84.33 | 24.85 | RBC | 70.43 | 10.80 | 92.57 | 50.52 |
| | | | WBC | 68.34 | | | |
| | | | Platelets | 12.78 | | | |
| YOLOv4 | 68.75 | 20.32 | RBC | 45.39 | 16.70 | 59.82 | 34.31 |
| | | | WBC | 31.81 | | | |
| | | | Platelets | 25.73 | | | |
| YOLOv5s | 86.50 | 90.50 | RBC | 84.60 | 11.70 | 85.47 | 91.00 |
| | | | WBC | 98.20 | | | |
| | | | Platelets | 90.20 | | | |
| YOLOv8n | 85.50 | 92.70 | RBC | 86.60 | 17.30 | 51.28 | 92.40 |
| | | | WBC | 98.70 | | | |
| | | | Platelets | 91.90 | | | |
| YOLOv8s | 87.10 | 92.70 | RBC | 88.30 | 18.10 | 55.25 | 94.20 |
| | | | WBC | 99.40 | | | |
| | | | Platelets | 94.90 | | | |
| YOLOv8m | 86.60 | 92.40 | RBC | 87.40 | 17.90 | 55.87 | 93.20 |
| | | | WBC | 99.10 | | | |
| | | | Platelets | 93.20 | | | |
| NBCDC-YOLOv8 (ours) | 88.00 | 92.50 | RBC | 89.70 | 10.10 | 98.69 | 94.70 |
| | | | WBC | 98.90 | | | |
| | | | Platelets | 95.50 | | | |

Note: Bold values are used to emphasize key results of our model and highlight its significance.
prioritises avoiding false positives and thus potentially misses some true positives in the process. This trade-off between precision and recall is common in highly accurate models, where the purpose is to minimise misclassification at the expense of slightly reduced recall.

In conclusion, while NBCDC-YOLOv8 achieves superior performance in terms of precision and mAP, the slightly lower recall is the result of the model's design. By focussing on maximising precision and improving small object detection, the model sacrifices some recall in favour of reducing false positives. Despite this trade-off, the overall performance of NBCDC-YOLOv8 remains highly competitive and makes it the most balanced model for blood cell detection tasks, where both precision and real-time detection are critical.
4.5 | Model generalisability
To thoroughly evaluate the generalisability of the proposed NBCDC-YOLOv8 model, we tested it on multiple datasets with varying characteristics.

The BCCD is a widely used blood cell classification dataset and serves as the primary training set. Here, there is a focus on common blood cell types. Our model achieved its highest mAP (94.7%) on this dataset, which indicates strong performance in familiar conditions.

The CBC dataset contains blood cell images collected from a different medical institution, and it features variations in cell morphology and imaging conditions. When tested on the CBC dataset, the model maintained a high precision (90.2%) and mAP (93.8%). Although there was a slight drop in recall and mAP compared to when employed on the BCCD, the model generalised well across datasets with moderate variations in conditions.

We generated a synthetic dataset by applying transformations to existing images to simulate real-world challenges common in blood microscopy images. This included adding noise, blurring, occlusion, and overlapping effects. On this dataset, our model achieved a precision of 86.0% but showed significant drops in recall (56.4%) and mAP (73.0%). This suggests that the model struggles to handle highly complex and noisy environments, and it missed a large proportion of true positives. Therefore, there is a need for further optimisation for such conditions.

The Leukaemia Dataset is a specialised dataset containing images of blood cells affected by leukaemia, and it tests a model's ability to detect abnormal cell types. On this dataset, our model's precision decreased to 82.0% and its mAP decreased to 70.3%. We also observed a significant drop in recall (56.4%). This lower performance suggests that the model requires further improvements for detecting rare or abnormal cell types.

Table 5 summarises the cross-dataset performance of NBCDC-YOLOv8. The model's slightly reduced performance on the CBC and synthetic datasets indicates that, while it generalises reasonably well across datasets with different characteristics, further fine-tuning is needed to handle more complex or abnormal conditions effectively. This highlights the model's strengths in detecting common blood cell types and its potential to be improved for more challenging scenarios.
4.6 | Qualitative analysis of blood cell detection
detection
Figure 10 qualitatively analyses the performance of NBCDC‐
YOLOv8 when faced with three key challenges: uneven distri-
bution, high density, and mutual occlusion of blood cells. For
uneven distribution, as shown in Figure 10a, where white blood
cells are isolated and red blood cells are sparsely distributed,
YOLOv8 (Figure 10) struggles with missed and false detections;
this is particularly the case in the lower sparse regions. In
contrast, NBCDC‐YOLOv8 (Figure 10c) demonstrates more
accurate detection due to its multi‐scale feature fusion capability.
TABLE 4 Ablation experiment.

| Model | Precision | Recall | mAP@0.5 |
| --- | --- | --- | --- |
| YOLOv8 | 81.8 | 94.2 | 91.6 |
| YOLOv8 + SPD | 82.2 | 96.8 | 94.0 |
| YOLOv8 + MultiSEAM | 84.6 | 95.0 | 94.1 |
| YOLOv8 + DetectHead × 4 | 87.5 | 91.5 | 93.7 |
| YOLOv8 + BiFPN | 86.9 | 93.3 | 94.4 |
| YOLOv8 + BiFPN + MultiSEAM | 86.4 | 90.7 | 92.1 |
| YOLOv8 + BiFPN + MultiSEAM + SPD | 84.6 | 94.1 | 93.7 |
| YOLOv8 + BiFPN + MultiSEAM + SPD + DetectHead × 4 (NBCDC-YOLOv8) | 88.0 | 92.5 | 94.7 |

Note: Bold values are used to emphasize key results of our model and highlight its significance.
TABLE 5 Comparison of results across different datasets.

| Dataset | Precision | Recall | mAP@0.5 |
| --- | --- | --- | --- |
| BCCD | 88.0 | 92.5 | 94.7 |
| CBC | 90.2 | 88.6 | 93.8 |
| Synthetic | 80.7 | 52.3 | 55.0 |
| Leukaemia | 56.4 | 88.0 | 70.3 |
FIGURE 10 Comparison of NBCDC‐YOLOv8 network's improvement on false detection and missed detection of adherent red blood cells. (a) Original
image, (b) image produced by YOLOv8, and (c) image produced by NBCDC‐YOLOv8.
FIGURE 11 Loss curve: The abscissa represents the number of iterations, while the ordinate represents the loss value. This demonstrates the variation
process of the loss function during model training.
For high-density regions, particularly in the lower right corner of Figure 10b, YOLOv8 exhibits inaccurate bounding boxes and false positives when detecting closely packed red blood cells. However, NBCDC-YOLOv8 (Figure 10c) effectively distinguishes adjacent cells with greater precision, thereby minimising false positives. Lastly, in the case of mutual occlusion, YOLOv8 shows incomplete bounding boxes and missed detections. NBCDC-YOLOv8 significantly improves detection accuracy by correctly identifying overlapping cells and resolving occlusions. These qualitative results illustrate NBCDC-YOLOv8's superior performance in addressing the challenges posed by uneven distribution, high density, and mutual occlusion in blood cell detection.
5 | DISCUSSION
To verify the convergence of the proposed algorithm, we present three blood cell loss curves of the improved NBCDC-YOLOv8 in Figure 11. It can be observed that the loss value decreases gradually and tends to stabilise when the number of iterations reaches approximately 50. This suggests that the algorithm converges rapidly, which indicates high training efficiency.

Table 6 compares the model and its accuracy from the present study with those from previous studies. The proposed NBCDC-YOLOv8 model notably outperforms previous cell detection techniques with its impressive mAP of 94.70%.

The NBCDC-YOLOv8 model offers significant improvements in terms of precision and mAP while maintaining real-time detection capabilities. This makes it especially useful in detecting smaller objects such as platelets. However, it demonstrates a slight reduction in recall due to its focus on minimising false positives, and its increased computational complexity may restrict its functionality in resource-poor environments. Future research could focus on optimising the model to improve recall without sacrificing precision and on simplifying the architecture to reduce computational demands. Such changes would make the model more suitable for a broader range of practical applications.
6 | CONCLUSIONS
This paper presents the NBCDC-YOLOv8 algorithm for real-time, efficient blood cell detection. It addresses challenges such as variable blood cell scale, limited data, dense occlusion, and low accuracy caused by erythrocyte adhesion. To compensate for the limited data, we apply Mosaic data enhancement, splicing four images to enrich data characteristics. We then introduce SPD-Conv to process low-resolution blood microscope images effectively. To handle scale variations and adhesion, we employ a BiFPN-inspired feature fusion structure that enhances multi-scale feature fusion and improves the feature expression of red blood cells. Next, we introduce a MultiSEAM module to accurately position densely packed cells, thereby improving detection accuracy. For small target detection, four detection heads are added to increase platelet recognition. The algorithm achieved an AP of 88.8% for red blood cells, 98.6% for white blood cells, and 89.5% for platelets. Moreover, its mAP on the BCCD was 94.7%, which surpassed that of previous methods. Using the CBC dataset to further test the model, the mAP reached 93.8%, thus indicating that the model has good generalisability and robustness. Future work will explore diverse cell data for clinical applications and broad use in medical image analysis.
AUTHOR CONTRIBUTIONS
Xuan Chen: Conceptualization; Data curation; Formal analysis; Methodology; Software; Writing - original draft; Writing - review & editing. Linxuan Li: Conceptualization; Data curation; Formal analysis; Visualization. Xiaoyu Liu: Investigation; Validation. Fengjuan Yin: Supervision; Validation; Visualization; Writing - review & editing. Xue Liu: Project administration; Supervision. Xiaoxiao Zhu: Supervision; Validation. Yufeng Wang: Funding acquisition; Resources; Supervision. Fanbin Meng: Conceptualization; Resources; Supervision.
ACKNOWLEDGEMENTS
The authors are grateful for the support and guidance from
Jining Medical University. This work was supported by the Jining
Medical University Classroom Teaching Reform Research
Project (Grant No. 2022KT012) and the Innovation and
Entrepreneurship Training Programme for College Students
(Grant Nos. 202210443002, 202210443003, S202310443006,
cx2023094z and cx2022044z).
CONFLICT OF INTEREST STATEMENT
The authors declare no conflicts of interest.
DATA AVAILABILITY STATEMENT
You can access our research code at the following GitHub repository: https://github.com/FanbinMeng-Group/NBCDC.git.
ORCID
Xuan Chen https://orcid.org/0009-0002-0129-4832
REFERENCES
1. Roland, L., Drillich, M., Iwersen, M.: Hematology as a diagnostic tool in
bovine medicine. J. Vet. Diagn. Invest. 26(5), 592–598 (2014). https://
doi.org/10.1177/1040638714546490
2. Atkins, C.G., et al.: Raman spectroscopy of blood and blood compo-
nents. Appl. Spectrosc.: Soc. Appl. Spectrosc. 71(5), 767–793 (2017).
https://doi.org/10.1177/0003702816686593
3. Hussein, S., et al.: Automatic segmentation and quantification of white and brown adipose tissues from PET/CT scans. IEEE Trans. Med. Imag. 36(3), 734–744 (2017)
4. Chabot-Richards, D.S., George, T.I.: White blood cell counts reference methodology. Clin. Lab. Med. 35(1), 11–24 (2015)
5. Garraud, O., Tissot, J.D.: Blood and blood components: from similarities to differences. Front. Med. 5, 84 (2018). https://doi.org/10.3389/fmed.2018.00084
6. Borland, D., et al.: Segmentor: a tool for manual refinement of 3D microscopy annotations. BMC Bioinf. 22(1), 260 (2021). https://doi.org/10.1186/s12859-021-04202-8
7. Acharjee, S., et al.: A Semiautomated Approach Using GUI for the Detection of Red Blood Cells, pp. 525–529 (2016)
8. Asghar, R., Kumar, S., Mahfooz, A.: Classification of Blood Cells Using Deep Learning Models (2023)
9. Alam, M.M., Islam, M.T.: Machine learning approach of automatic identification and counting of blood cells. Healthc. Technol. Lett. 6(4), 103–108 (2019). https://doi.org/10.1049/htl.2018.5098
10. Khan, S., et al.: A review on traditional machine learning and deep learning models for WBCs classification in blood smear images. IEEE Access 9, 10657–10673 (2021). https://doi.org/10.1109/ACCESS.2020.3048172
TABLE 6 Comparison with state-of-the-art methods.

| Model | Year | Dataset | mAP (%) |
| --- | --- | --- | --- |
| Faster-RCNN [19] | 2017 | BCCD | 76.50 |
| CycleGAN [32] | 2020 | BCCD | 83.90 |
| ISE-YOLO [33] | 2021 | BCCD | 85.70 |
| Integrated attention mechanism and YOLOv5 [34] | 2024 | BCCD | 89.60 |
| Fast and efficient YOLOv3 [35] | 2021 | BCCD | 89.86 |
| CNN architecture based on YOLO [36] | 2022 | BCCD | 91.13 |
| TE-YOLOF-B [37] | 2021 | BCCD | 91.90 |
| AYOLOv5 [38] | 2024 | BCCD | 93.30 |
| NBCDC-YOLOv8 (ours) | - | BCCD | 94.70 |

Note: Bold values are used to emphasize key results of our model and highlight its significance.
11. Das, P.K., et al.: A systematic review on recent advancements in deep and machine learning based detection and classification of acute lymphoblastic leukemia. IEEE Access 10, 81741–81763 (2022). https://doi.org/10.1109/ACCESS.2022.3196037
12. Mulya, R.F., Utami, E., Ariatmanto, D.: Classification of acute lymphoblastic leukemia based on white blood cell images using InceptionV3 model. J. RESTI (Rekayasa Sistem dan Teknologi Informasi) 7(4), 947–952 (2023). https://doi.org/10.29207/resti.v7i4.5182
13. Rahat, I., et al.: A step towards automated haematology: DL models for blood cell detection and classification. EAI Endorsed Trans. Pervasive Health Technol. 10 (2024). https://doi.org/10.4108/eetpht.10.5477
14. Zhang, D., Zhang, P., Wang, L.: Cell Counting Algorithm Based on
YOLOv3 and Image Density Estimation, pp. 920–924 (2019)
15. Moallem, G., et al.: Detecting and Segmenting Overlapping Red Blood
Cells in Microscopic Images of Thin Blood Smears (2018)
16. Shahzad, M., et al.: Blood cell image segmentation and classification: a systematic review. PeerJ Comput. Sci. 10, e1813 (2024). https://doi.org/10.7717/peerj-cs.1813
17. Cheng, W., et al.: Application of image recognition technology in pathological diagnosis of blood smears. Clin. Exp. Med. 24(1), 181 (2024). https://doi.org/10.1007/s10238-024-01379-z
18. Girshick, R., et al.: Rich feature hierarchies for accurate object detection
and semantic segmentation. IEEE Comput. Soc., 580–587 (2014).
https://doi.org/10.1109/CVPR.2014.81
19. Ren, S., et al.: Faster R‐CNN: towards real‐time object detection with
region proposal networks. IEEE Trans. Pattern Anal. Mach. Intell. 39(6),
1137–1149 (2017). https://doi.org/10.1109/TPAMI.2016.2577031
20. Redmon, J., et al.: You only look once: unified, real-time object detection. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 779–788 (2016). https://doi.org/10.1109/CVPR.2016.91
21. Redmon, J., Farhadi, A.: YOLO9000: Better, Faster, Stronger, pp. 6517–
6525 (2017)
22. Redmon, J., Farhadi, A.: YOLOv3: An Incremental Improvement. arXiv preprint arXiv:1804.02767 (2018). https://doi.org/10.48550/arXiv.1804.02767
23. Berg, A.C., et al.: SSD: Single Shot MultiBox Detector, vol. 9905, pp. 21–37 (2015). https://doi.org/10.1007/978-3-319-46448-0_2
24. Feng, C., et al.: TOOD: Task‐Aligned One‐Stage Object Detection, pp.
3490–3499 (2021). https://doi.org/10.1109/ICCV48922.2021.00349
25. Li, X., et al.: Generalized Focal Loss: Learning Qualified and Distributed Bounding Boxes for Dense Object Detection, pp. 21002–21012 (2020). https://doi.org/10.48550/arXiv.2006.04388
26. Hu, B., et al.: A Preliminary Study on Data Augmentation of Deep
Learning for Image Classication, pp. 117–122 (2019). https://doi.org/
10.1145/3361242.3361259
27. Bochkovskiy, A., Wang, C.Y., Liao, H.Y.M.: YOLOv4: Optimal Speed and Accuracy of Object Detection. arXiv preprint arXiv:2004.10934 (2020). https://doi.org/10.48550/arXiv.2004.10934
28. Yun, S., et al.: CutMix: Regularization Strategy to Train Strong Classifiers with Localizable Features, pp. 6022–6031 (2019)
29. Sunkara, R., Luo, T.: No More Strided Convolutions or Pooling: A New
CNN Building Block for Low‐Resolution Images and Small Objects (2022)
30. Tan, M., Pang, R., Le, Q.V.: EfficientDet: Scalable and Efficient Object Detection, pp. 10778–10787 (2019). https://doi.org/10.48550/arxiv.1911.09070
31. Yu, Z., et al.: YOLO-FaceV2: A Scale and Occlusion Aware Face Detector. arXiv preprint arXiv:2208.02019 (2022). https://doi.org/10.48550/arXiv.2208.02019
32. He, J., et al.: CycleGAN with an improved loss function for cell detection
using partly labeled images. IEEE J. Biomed. Health Inform. 24(9),
2473–2480 (2020). https://doi.org/10.1109/JBHI.2020.2970091
33. Liu, C., et al.: Improved squeeze-and-excitation attention module based YOLO for blood cells detection, pp. 3911–3916 (2021)
34. Shahin, O.R., et al.: Optimized automated blood cells analysis using
Enhanced Greywolf Optimization with integrated attention mechanism
and YOLOv5. Alex. Eng. J. 109, 58–70 (2024). https://doi.org/10.1016/
j.aej.2024.08.054
35. Shakarami, A., et al.: A fast and yet efficient YOLOv3 for blood cell detection. Biomed. Signal Process Control 66 (2021). https://doi.org/10.1016/j.bspc.2021.102495
36. Amudhan, A.N., et al.: RFSOD: a lightweight single-stage detector for real-time embedded applications to detect small-size objects. J. Real-Time Image Process 19(1), 133–146 (2022). https://doi.org/10.1007/s11554-021-01170-3
37. Xu, F., et al.: TE-YOLOF: tiny and efficient YOLOF for blood cell detection. Biomed. Signal Process Control 73, 103416 (2022)
38. Gu, W., Sun, K.: AYOLOv5: improved YOLOv5 based on attention mechanism for blood cell detection. Biomed. Signal Process Control 88, 105034 (2024). https://doi.org/10.1016/j.bspc.2023.105034
SUPPORTING INFORMATION
Additional supporting information can be found online in the
Supporting Information section at the end of this article.
How to cite this article: Chen, X., et al.: NBCDC-YOLOv8: a new framework to improve blood cell detection and classification based on YOLOv8. IET Comput. Vis. e12341 (2025). https://doi.org/10.1049/cvi2.12341