Face Mask Detection using Convolutional Neural Network (CNN) to reduce the spread of Covid-19

Authors:
F.M. Javed Mehedi Shamrat
Department of Software Engineering
Daffodil International University
Dhaka, Bangladesh
javedmehedicom@gmail.com
Md. Masum Billah
Department of Software Engineering
Daffodil International University
Dhaka, Bangladesh
masum.swe.ndc@gmail.com
Md Saidul Islam
Department of Computer Science and Engineering
Jiangsu University of Science and Technology
Jiangsu, China
roney.orcl@gmail.com
Sovon Chakraborty
Department of Computer Science and Engineering
European University of Bangladesh
Dhaka, Bangladesh
sovonchakraborty2014@gmail.com
Md. Al Jubair
Department of Computer Science and Engineering
European University of Bangladesh
Dhaka, Bangladesh
jubair@eub.edu.bd
Rumesh Ranjan*
Department of Plant Breeding and Genetics
Punjab Agriculture University
Punjab, India
rumeshranjan@pau.edu
Abstract: The COVID-19 coronavirus pandemic is wreaking havoc on
the world's health, and the healthcare sector is in a state of
crisis. Many precautionary steps have been taken to prevent the
spread of this disease, including the use of masks, which is
strongly recommended by the World Health Organization
(WHO). In this paper, we use three deep learning methods for
face mask detection, namely a CNN with Max pooling, a CNN with
Average pooling, and the MobileNetV2 architecture, and report each
method's detection accuracy. A dataset containing 1845 images from
various sources and 120 co-author pictures taken with a webcam and
a mobile phone camera is used to train the deep learning
architectures. Max pooling achieved 96.49% training accuracy and
98.67% validation accuracy. Average pooling achieved 95.19%
training accuracy and 96.23% validation accuracy. The
MobileNetV2 architecture achieved the highest accuracy: 99.72% for
training and 99.82% for validation.
Keywords: face mask; max pooling; COVID-19; average pooling;
mask detection; MobileNetV2; CNN
I. INTRODUCTION
The term "novel coronavirus" refers to a new type of
coronavirus that had never been observed in humans before.
Coronaviruses are a family of viruses that can cause a variety
of illnesses, from the common cold to life-threatening infections
such as Middle East Respiratory Syndrome and Severe Acute
Respiratory Syndrome [1]. The first coronavirus-infected patient
was discovered in December 2019, and COVID-19 has since become a
worldwide pandemic [2]. People all around the world are in
precarious conditions as a consequence of the pandemic. Every
day, a large number of people become infected with the disease
and suffer as a result. At the time of writing, 16,207,130
infected cases had been reported, with 648,513 deaths [3]. These
figures are steadily growing. According to the World Health
Organization (WHO), the most frequent symptoms of coronavirus
infection are fever, dry cough, exhaustion, diarrhea, and loss of
taste and smell [4]. Many researchers and developers have been
applying machine learning and deep learning to disease
prediction for several years [5-9].
Jiang et al. [10] propose RetinaFaceMask, a face mask detector
combined with a cross-class object removal algorithm. The model
is a single-stage detector that uses a feature pyramid network to
achieve slightly better precision and recall than the baseline
result. To address the lack of datasets, they used a learning
algorithm [11] as well as deep learning [12-16] methodology.
Gupta et al. [17] proposed a model to enforce social distancing
using smart city and Intelligent Transportation System (ITS)
infrastructure during the COVID-19 pandemic. Their model called
for the installation of sensors in the city to monitor the
movement of objects in real time, as well as the development of a
data-sharing network. Won Sonn and Lee [18] explain how a smart
city aided in controlling the spread of coronavirus in South
Korea. A time-space cartographer sped up the city's contact
tracing, which drew on patient movement, transaction history,
mobile phone use, and cell phone location. CCTV cameras in
residential building hallways were also monitored in real time.
5th International Conference on Trends in Electronics and Informatics (ICOEI 2021)
Tirunelveli, India, 3-5, June 2021
Pre-Print
In [19-22], M. Loey et al. evaluated the performance of
different machine learning algorithms for detecting face masks
and for various other purposes. In their study, three datasets are
used, with ResNet50 performing feature extraction. For the
classification process, decision tree, support vector machine,
and ensemble algorithms are used, which gave high detection
accuracy on each dataset.
The main objective of the paper [23] is to detect a person
without a face mask and inform the authorities in order to reduce
the spread of COVID-19. The images used in the process are
captured by CCTV cameras. After preprocessing the data, feature
extraction and classification are done using a CNN. The trained
model shows an accuracy of 98.7%. The authors of the paper
[24] designed a binary face classifier to detect faces irrespective
of their alignment. To detect masks in input images of arbitrary
size, the VGG-16 architecture is used for feature extraction [25].
In that work, gradient descent is used for training while binary
cross-entropy is used as the loss function. M.S. Ejaz et
al. [27] implemented PCA for masked and non-masked facial image
recognition. The Viola-Jones algorithm is used in that paper to
detect the face region, PCA is used to compute Eigenfaces, and
the nearest neighbor (NN) classifier distance is used for face
recognition.
The rest of the paper is organized as follows. This section has
reviewed the most recent developments in the field of face mask
detection. The methodology for designing the whole system is
outlined in Section II. Section III examines the results of the
proposed framework. Section IV concludes the paper with its
limitations and suggestions for future work.
II. RESEARCH METHODOLOGY
CNNs are a kind of deep neural network typically used in deep
learning to analyze visual imagery. A CNN takes an image as
input, assigns importance to different parts of the image, and
differentiates between them. Because of their high accuracy, CNNs
are used for image detection [27] and recognition. A CNN uses a
hierarchical model that builds a funnel-shaped network and ends
in a fully connected layer, where all the neurons are connected
to each other and the output is processed. Artificial
intelligence has made important strides in bridging the gap
between human and machine capabilities. Researchers and
enthusiasts alike work on many facets of the field to achieve
impressive results. Computer vision is one such area. Its goal is
to allow machines to see and understand the world in the same way
that humans do, and to use that knowledge for tasks such as image
and video recognition, image analysis and classification, media
recreation, recommendation systems, and natural language
processing.
In this paper, we used three deep learning methods for face
mask detection: a CNN with Max pooling, a CNN with Average
pooling, and the MobileNetV2 architecture. The entire proposed
system is shown in Fig. 1.
Fig. 1. Proposed model diagram.
A. Data Collection:
For mask detection, we used three different datasets with a
total of 1340 photographs. Another 120 photographs were taken
using mobile cameras, webcams, and CCTV video. For detecting
masks in video, we used CCTV footage and a webcam. All of the
images are in RGB. To avoid overfitting, we combined data from
different sources, namely the Real-World Masked Face Dataset
(RMFD) [28] and the Simulated Masked Face Dataset (SMFD) [29],
with our own images to build the dataset used for training and
testing.
Fig. 2. Datasets images Samples.
B. Preprocessing and Augmentation of Data:
The images in the dataset are not all the same size, so
preprocessing was required for this study. Training deep
learning models also requires a large amount of data. We used
Keras' ImageDataGenerator to resize all of the images to
256 × 256 pixels and then normalized them. For faster
computation, the images are converted to NumPy arrays. We
increased the amount of data by rotating, zooming, shearing, and
horizontally flipping the images. The images are gathered as
well. The images are then reduced to 128 × 128 before passing
through the second convolution layer, and to 64 × 64 before
passing through the third convolution layer.
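The normalization and horizontal-flip steps described above can be sketched in plain NumPy. This is only an illustrative sketch, not the Keras pipeline used in the paper: the helper names, the random stand-in batch, and the choice to scale pixels to [0, 1] are assumptions, and rotation, zoom, and shear are left out.

```python
import numpy as np

def normalize(images: np.ndarray) -> np.ndarray:
    """Scale uint8 pixel values from [0, 255] to floats in [0, 1]."""
    return images.astype(np.float32) / 255.0

def horizontal_flip(images: np.ndarray) -> np.ndarray:
    """Mirror each image left to right; axis 2 is width in (N, H, W, C)."""
    return images[:, :, ::-1, :]

def augment(images: np.ndarray) -> np.ndarray:
    """Return the original batch followed by its flipped copies."""
    return np.concatenate([images, horizontal_flip(images)], axis=0)

# Hypothetical stand-in for four images already resized to 256 x 256 RGB.
rng = np.random.default_rng(0)
batch = rng.integers(0, 256, size=(4, 256, 256, 3), dtype=np.uint8)

prepared = augment(normalize(batch))
print(prepared.shape, prepared.dtype)
```

Flipping doubles the number of samples without collecting new photographs, which is the purpose of augmentation on a small dataset.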
C. Proposed Convolutional Neural Network (CNN)
Architecture:
CNNs are used for classification and image processing. A CNN
consists of one or more convolution layers. It aims to find
informative features inside an image rather than working with
the entire image. A CNN has several hidden layers, as well as an
input layer and an output layer. In this research, we applied a
deep CNN with 3 convolution layers. Convolution is an operation
that combines two mathematical functions to produce a new
function. Max pooling is a sample-based discretization process.
Its aim is to reduce the dimensionality of an input
representation, allowing assumptions to be made about the
features contained in the binned sub-regions. Our CNN model's
working process with Max pooling is depicted in Fig. 3.
Fig. 3. Three Convolution Layer with Max pooling operation.
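To make the pooling step concrete, the following NumPy sketch applies non-overlapping 2 × 2 pooling to a small feature map, using either the maximum (as in this model) or the mean (as in the Average pooling variant compared in this paper). The function name `pool2d`, the window size, and the example values are illustrative assumptions rather than the authors' code.

```python
import numpy as np

def pool2d(feature_map: np.ndarray, size: int = 2, mode: str = "max") -> np.ndarray:
    """Non-overlapping pooling over a 2-D map whose sides are divisible by `size`."""
    h, w = feature_map.shape
    # Split the map into (size x size) blocks: shape (h//size, size, w//size, size).
    blocks = feature_map.reshape(h // size, size, w // size, size)
    if mode == "max":
        return blocks.max(axis=(1, 3))
    if mode == "average":
        return blocks.mean(axis=(1, 3))
    raise ValueError(f"unknown mode: {mode}")

fmap = np.array([[1, 3, 2, 0],
                 [5, 7, 4, 6],
                 [9, 8, 1, 1],
                 [2, 4, 3, 5]], dtype=float)

max_pooled = pool2d(fmap, mode="max")      # [[7, 6], [9, 5]]
avg_pooled = pool2d(fmap, mode="average")  # [[4.0, 3.0], [5.75, 2.5]]
print(max_pooled)
print(avg_pooled)
```

Both variants halve each spatial dimension. Max pooling keeps the strongest activation in each window, while average pooling smooths over it.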
Next, the same architecture is used for feature mapping, but
with an average pooling process. The model's operation is shown
in Fig. 4. Average pooling takes the average of all values within
the region of interest of the image matrix, while Max pooling
takes the largest value within that region. Our CNN model is
initialized with keras.models.Sequential(). In the first hidden
layer, the ReLU activation function is used, followed by the
Max pooling process. Max pooling helps to gather significant
information and reduces the size of the images. After that, the
data is passed to the second convolution layer. Max pooling is
used once more to obtain the most notable information. The
resulting image matrix is then flattened and used for training.
To observe the model's performance with a different pooling
strategy, we then replaced the Max pooling operation with the
Average pooling operation. For more accurate training, the Adam
stochastic gradient descent optimizer was used. We use 80% of our
dataset's images for training.
Fig. 4. Three Convolution Layer with Average pooling operation.
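The 80% training split mentioned above can be sketched with the standard library alone. The text states only the training fraction, so treating the remaining 20% as the validation set, shuffling before the split, the fixed seed, and the file names are all assumptions made for illustration.

```python
import random

def train_validation_split(items, train_fraction=0.8, seed=42):
    """Shuffle a copy of `items` and split it into training and validation lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

# Hypothetical file names standing in for the image dataset.
paths = [f"img_{i:04d}.jpg" for i in range(100)]
train_paths, val_paths = train_validation_split(paths)
print(len(train_paths), len(val_paths))  # 80 20
```

Shuffling first keeps images from any single source from clustering in one split.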
D. MobileNetV2 Architecture:
MobileNetV2 is a powerful image classification tool. It is a
lightweight CNN-based deep learning model for which TensorFlow
provides pre-trained ImageNet weights. First, the top
(classification) layer of MobileNetV2 is removed, and a new
trainable head is added. The model analyzes the data and extracts
the most relevant features from our images. There are 19
bottleneck layers in MobileNetV2 [30]. For face detection, we
used OpenCV's face detector, which is based on the ResNet-10
architecture [30]. OpenCV's Caffe model is used to detect the
face in an image or a video stream, and the detected face region
is passed to the mask detection classifier. This allows for
faster and more accurate detection of masks in video streams.
Overfitting is a major problem in machine learning, so a Dropout
layer was used to prevent our model from overfitting the dataset.
Using MobileNetV2(include_top=False), we removed the top layer.
The images were resized to match the network input. In our
trainable head, the average pooling operation is used with a pool
size of (7, 7), followed by a hidden layer of 128 units. The ReLU
activation function is used in the hidden layer, and the SoftMax
activation function is used in the fully connected output layer.
For better accuracy, we set a learning rate of 0.01. The Adam
stochastic gradient descent optimizer is used to train the model
on the image features. The working layers of MobileNetV2 are
depicted in Fig. 5.
Fig. 5. MobileNetV2 Architecture.
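The trainable head described above (7 × 7 average pooling, a 128-unit ReLU layer, and a SoftMax output) can be illustrated as a plain NumPy forward pass. This is a hedged sketch rather than the authors' Keras code. It assumes a backbone output of shape 7 × 7 × 1280 (what MobileNetV2 without its top produces for 224 × 224 inputs), two output classes, and random untrained weights. Dropout is omitted because it is only active during training.

```python
import numpy as np

def relu(x: np.ndarray) -> np.ndarray:
    return np.maximum(x, 0.0)

def softmax(logits: np.ndarray) -> np.ndarray:
    # Subtract the row maximum for numerical stability.
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def classification_head(features, w_hidden, b_hidden, w_out, b_out):
    """Map backbone features (N, 7, 7, C) to class probabilities (N, 2)."""
    pooled = features.mean(axis=(1, 2))          # (7, 7) average pooling -> (N, C)
    hidden = relu(pooled @ w_hidden + b_hidden)  # 128-unit hidden layer with ReLU
    return softmax(hidden @ w_out + b_out)       # two-way SoftMax: mask / no mask

rng = np.random.default_rng(0)
channels = 1280  # assumed depth of the backbone output
features = rng.normal(size=(3, 7, 7, channels))  # stand-in for extracted features
w_hidden = rng.normal(scale=0.01, size=(channels, 128))
b_hidden = np.zeros(128)
w_out = rng.normal(scale=0.01, size=(128, 2))
b_out = np.zeros(2)

probs = classification_head(features, w_hidden, b_hidden, w_out, b_out)
print(probs.shape)
```

In a transfer-learning setup of this kind, only the head weights would normally be trained while the pre-trained backbone stays frozen.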
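E. Evaluating performance using performance metrics:
We measured the performance of two models using
precision, recall, F1-score, and accuracy after completing the
training and testing phases. The formulas that we used are as
follows:
\[ \text{Precision} = \frac{TP}{TP + FP} \tag{1} \]
\[ \text{Recall} = \frac{TP}{TP + FN} \tag{2} \]
\[ \text{F1-score} = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \tag{3} \]
\[ \text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{4} \]
where TP, TN, FP, and FN denote the numbers of true positives,
true negatives, false positives, and false negatives.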
III. EXPERIMENT RESULT ANALYSIS
We used two datasets to detect masks in images: 1845
images from various sources and 120 co-author photos taken
with a webcam and a mobile phone camera. Table I shows the
training and validation accuracy obtained with the Deep CNN [31]
model when Max pooling is used to reduce the dimensions of the
image feature map. The highest accuracy is 96.49% on the training
data and 98.67% on the validation data.
TABLE I. OUTCOMES FOR DEEP CNN AFTER APPLYING MAX POOLING OF DIFFERENT EPOCHS
Epoch   Training Loss   Training Accuracy   Validation Loss   Validation Accuracy
1       42.13%          89.76%              12.32%            90.73%
2       10.01%          91.87%              8.43%             94.34%
3       8.45%           93.97%              7.33%             96.10%
4       8.21%           94.53%              7.25%             96.25%
5       7.04%           94.98%              7.10%             97.03%
6       6.90%           95.12%              6.35%             97.23%
7       6.83%           95.24%              6.12%             97.54%
8       6.56%           95.65%              6.01%             97.71%
9       5.99%           95.89%              4.88%             97.92%
10      5.83%           96.07%              4.76%             98.12%
11      5.72%           96.45%              4.65%             98.36%
12      5.12%           96.48%              4.23%             98.43%
13      5.05%           96.49%              4.12%             98.67%
The training accuracy and validation accuracy graphs are
shown in Fig. 6. Next, the same CNN architecture is applied with
Average pooling used to reduce the dimensions of the feature
map. Compared to the previous model, the results are less
accurate. As seen in Table II, the maximum training accuracy is
95.19% with a training loss of 5.92%, and the validation accuracy
is 96.23%.
Fig. 6. Test Accuracy and Training Accuracy for CNN with Max Pooling
Layer.
TABLE II. OUTCOMES FOR DEEP CNN AFTER APPLYING AVERAGE POOLING OF DIFFERENT EPOCHS
Epoch   Training Loss   Training Accuracy   Validation Loss   Validation Accuracy
1       43.54%          88.92%              13.32%            89.95%
2       11.80%          90.21%              9.43%             90.12%
3       10.99%          90.85%              9.33%             91.01%
4       9.82%           91.06%              8.25%             91.52%
5       9.21%           91.24%              8.10%             93.25%
6       8.95%           92.37%              8.35%             93.54%
7       8.71%           92.69%              7.12%             94.21%
8       8.12%           94.01%              7.01%             94.39%
9       7.10%           94.29%              7.88%             95.11%
10      6.75%           94.65%              6.76%             95.15%
11      6.62%           94.82%              6.65%             95.20%
12      6.32%           95.12%              6.23%             96.12%
13      5.92%           95.19%              5.12%             96.23%
Fig. 7 plots the validation and training accuracy for each
epoch.
Fig. 7. Test Accuracy and Training Accuracy for CNN with Average
Pooling Layer.
The accuracy improved significantly with the MobileNetV2
architecture. Table III shows the training and validation
accuracy for each epoch.
TABLE III. DIFFERENT OUTCOMES AFTER APPLYING MOBILENETV2 ARCHITECTURE.
Epoch   Training Loss   Training Accuracy   Validation Loss   Validation Accuracy
1       4.43%           98.67%              4.21%             98.71%
2       4.32%           98.72%              4.12%             98.81%
3       4.21%           98.81%              4.09%             98.92%
4       4.12%           98.92%              3.89%             99.10%
5       3.90%           98.99%              3.72%             99.13%
6       3.82%           99.01%              3.61%             99.25%
7       3.78%           99.13%              3.56%             99.32%
8       3.65%           99.24%              3.41%             99.37%
9       3.61%           99.32%              3.23%             99.47%
10      3.54%           99.51%              3.20%             99.65%
11      3.46%           99.63%              3.18%             99.82%
12      3.42%           99.72%              3.18%             99.82%
13      3.42%           99.72%              3.18%             99.82%
According to Table III, the best accuracy is 99.72% on the
training data and 99.82% on the validation data. The validation
loss is just 3.18%. Fig. 5 shows the detailed comparison of test
accuracy and validation accuracy of MobileNetV2, which is a
CNN-based architecture. After using the MobileNetV2
architecture, we computed the confusion matrix, which is depicted
in Fig. 8.
Fig. 8. Confusion Matrix after applying MobileNetV2.
The MobileNetV2 design outperformed the other models included
in this study. This model is capable of recognizing the mask in
a picture. Figs. 9, 10, and 11 show the detection results of
MobileNetV2.
Fig. 9. Detection of No Mask from an image.
Fig. 10. Detection of Mask from an image.
MobileNetV2 can successfully identify the mask in video
streams with good accuracy.
Fig. 11. Detection of the mask from video streams using MobilenetV2.
Max pooling achieved 96.49% training accuracy and 98.67%
validation accuracy. Average pooling achieved 95.19% training
accuracy and 96.23% validation accuracy. The MobileNetV2
architecture achieved the highest accuracy: 99.72% for training
and 99.82% for validation. A summary is given in Table IV.
TABLE IV. COMPARISON WITHIN THE CNN TECHNIQUES
Method            Training Loss   Training Accuracy   Validation Loss   Validation Accuracy
Max Pooling       5.05%           96.49%              4.12%             98.67%
Average Pooling   5.92%           95.19%              5.12%             96.23%
MobileNetV2       3.42%           99.72%              3.18%             99.82%
IV. CONCLUSION AND FUTURE WORK
We used two deep CNN architectures and one CNN-based
MobileNetV2 architecture in this study. Our primary objective
was to propose a suitable model with high accuracy so that
mask identification would be simple throughout the pandemic. In
future work, we plan to assess performance on a larger dataset,
add further models to compare with MobileNetV2, and integrate
this model with IoT [32-35] to detect people without masks
automatically.
REFERENCES
[1] WHO EMRO | About COVID-19 | COVID-19 | Health topics.
[Online]. Available: http://www.emro.who.int/health-
topics/corona-virus/about-covid-19.html, accessed on: Jul. 26,
2020.
[2] H. Lau et al., “Internationally lost COVID-19 cases,” J. Microbiol.
Immunol. Infect., vol. 53, no. 3, pp. 454-458, 2020.
[3] Worldometer, “Coronavirus Cases,”. [Online]. Available:
https://www.worldometers.info/coronavirus, accessed on: Jul. 26,
2020.
[4] L. Li et al., “COVID-19 patients’ clinical characteristics, discharge
rate, and fatality rate of meta-analysis,” J. Med. Virol., vol. 92, no.
6, pp. 577-583, Jun. 2020.
[5] P. Ghosh et al., "Efficient Prediction of Cardiovascular Disease
Using Machine Learning Algorithms With Relief and LASSO
Feature Selection Techniques," in IEEE Access, vol. 9, pp. 19304-
19326, 2021, doi: 10.1109/ACCESS.2021.3053759.
[6] F.M. Javed Mehedi Shamrat, Md. Asaduzzaman, A.K.M. Sazzadur
Rahman, Raja Tariqul Hasan Tusher, Zarrin Tasnim “A
Comparative Analysis of Parkinson Disease Prediction Using
Machine Learning Approaches” International Journal of Scientific
& Technology Research, Volume 8, Issue 11, November 2019,
ISSN: 2277-8616, pp: 2576-2580.
[7] A.K.M Sazzadur Rahman, F. M. Javed Mehedi Shamrat, Zarrin
Tasnim, Joy Roy, Syed Akhter Hossain “A Comparative Study on
Liver Disease Prediction Using Supervised Machine Learning
Algorithms” International Journal of Scientific & Technology
Research, Volume 8, Issue 11, November 2019, ISSN: 2277-8616,
pp: 419-422.
[8] F. M. Javed Mehedi Shamrat, Md. Abu Raihan, A.K.M. Sazzadur
Rahman, Imran Mahmud, Rozina Akter, “An Analysis on Breast
Disease Prediction Using Machine Learning Approaches”
International Journal of Scientific & Technology Research, Volume
9, Issue 02, February 2020, ISSN: 2277-8616, pp: 2450-2455.
[9] F. M. Javed Mehedi Shamrat, Zarrin Tasnim, Imran Mahmud, Ms.
Nusrat Jahan, Naimul Islam Nobel, “Application Of K-Means
Clustering Algorithm To Determine The Density Of Demand Of
Different Kinds Of Jobs”, International Journal of Scientific &
Technology Research, Volume 9, Issue 02, February 2020, ISSN:
2277-8616, pp: 2550-2557.
[10] M. Jiang, X. Fan, and H. Yan, “RetinaMask: A Face Mask detector,”
2020. [Online]. Available: http://arxiv.org/abs/2005.03950.
[11] P. Ghosh, S. Azam, A. Karim, M. Jonkman, MDZ Hasan, “Use of
Efficient Machine Learning Techniques in the Identification of
Patients with Heart Diseases,” 5th ACM International Conference
on Information System and Data Mining (ICISDM2021), 2021.
[12] M. S. Junayed, A. A. Jeny, S. T. Atik, N. Neehal, A. Karim, S.
Azam, and B. Shanmugam, “AcneNet - A Deep CNN Based
Classification Approach for Acne Classes,” 2019 12th International
Conference on Information & Communication Technology and
System (ICTS), 2019.
[13] A. Karim, P. Ghosh, A. A. Anjum, M. S. Junayed, Z. H. Md, K. M.
Hasib, and A. N. Bin Emran, “A Comparative Study of Different
Deep Learning Model for Recognition of Handwriting Digits,”
SSRN Electronic Journal, 2021.
[14] M. Al Karim, A. Karim, S. Azam, E. Ahmed, F. De Boer, A. Islam,
and F. N. Nur, “Cognitive Learning Environment and Classroom
Analytics (CLECA): A Method Based on Dynamic Data Mining
Techniques,” Innovative Data Communication Technologies and
Application, pp. 787-797, 2021.
[15] Chen, Joy Iong Zong, and S. Smys. Social Multimedia Security and
Suspicious Activity Detection in SDN using Hybrid Deep Learning
Technique. Journal of Information Technology 2, no. 02 (2020):
108-115.
[16] Smys, S., Joy Iong Zong Chen, and Subarna Shakya. Survey on
Neural Network Architectures with Deep Learning. Journal of Soft
Computing Paradigm (JSCP) 2, no. 03 (2020): 186-194.
[17] M. Gupta, M. Abdelsalam, and S. Mittal, “Enabling and Enforcing
Social Distancing Measures using Smart City and ITS
Infrastructures: A COVID-19 Use Case,” 2020. [Online]. Available:
https://arxiv.org/abs/2004.09246.
[18] J. Won Sonn and J. K. Lee, “The smart city as time-space
cartographer in COVID-19 control: the South Korean strategy and
democratic control of surveillance technology,” Eurasian Geogr.
Econ., pp. 1-11, May 2020.
[19] Loey M, Manogaran G, Taha MHN, Khalifa NEM. A hybrid deep
transfer learning model with machine learning methods for face
mask detection in the era of the COVID-19 pandemic. Measurement
: Journal of the International Measurement Confederation. 2021
Jan;167:108288. DOI: 10.1016/j.measurement.2020.108288.
[20] F. M. Javed Mehedi Shamrat, P. Ghosh, M. H. Sadek, M. A. Kazi
and S. Shultana, "Implementation of Machine Learning Algorithms
to Detect the Prognosis Rate of Kidney Disease," 2020 IEEE
International Conference for Innovation in Technology (INOCON),
Bangluru, India, 2020, pp. 1-7, doi:
10.1109/INOCON50539.2020.9298026.
[21] P. Ghosh, F. M. Javed Mehedi Shamrat, S. Shultana, S. Afrin, A. A.
Anjum and A. A. Khan, "Optimization of Prediction Method of
Chronic Kidney Disease Using Machine Learning Algorithm," 2020
15th International Joint Symposium on Artificial Intelligence and
Natural Language Processing (iSAI-NLP), Bangkok, Thailand,
2020, pp. 1-6, doi: 10.1109/iSAI-NLP51646.2020.9376787.
[22] F. M. Javed Mehedi Shamrat, Z. Tasnim, P. Ghosh, A. Majumder
and M. Z. Hasan, "Personalization of Job Circular Announcement
to Applicants Using Decision Tree Classification Algorithm," 2020
IEEE International Conference for Innovation in Technology
(INOCON), Bangluru, India, 2020, pp. 1-5, doi:
10.1109/INOCON50539.2020.9298253.
[23] M. M. Rahman, M. M. H. Manik, M. M. Islam, S. Mahmud and J. -
H. Kim, "An Automated System to Limit COVID-19 Using Facial
Mask Detection in Smart City Network," 2020 IEEE International
IOT, Electronics and Mechatronics Conference (IEMTRONICS),
Vancouver, BC, Canada, 2020, pp. 1-5, doi:
10.1109/IEMTRONICS51293.2020.9216386.
[24] T. Meenpal, A. Balakrishnan and A. Verma, "Facial Mask Detection
using Semantic Segmentation," 2019 4th International Conference
on Computing, Communications and Security (ICCCS), Rome,
Italy, 2019, pp. 1-5, doi: 10.1109/CCCS.2019.8888092.
[25] F. M. Javed Mehedi Shamrat, Imran Mahmud, A.K.M Sazzadur
Rahman, Anup Majumder, Zarrin Tasnim, Naimul Islam Nobel,“A
Smart Automated System Model For Vehicles Detection To
Maintain Traffic By Image Processing” International Journal of
Scientific & Technology Research, Volume 9, Issue 02, February
2020, ISSN: 2277-8616, pp: 2921-2928.
[26] A. Islam Chowdhury, M. Munem Shahriar, A. Islam, E. Ahmed, A.
Karim, and M. Rezwanul Islam, “An Automated System in ATM
Booth Using Face Encoding and Emotion Recognition Process,”
2020 2nd International Conference on Image Processing and
Machine Vision, 2020.
[27] M.S. Ejaz, M.R. Islam, M. Sifatullah, A. Sarker, “Implementation of
principal component analysis on masked and non-masked face
recognition,” 2019 1st International Conference on Advances in
Science, Engineering and Robotics Technology (ICASERT), 2019,
pp. 1-5, doi: 10.1109/ICASERT.2019.8934543.
[28] https://github.com/X-zhangyang/Real-World-Masked-Face-
Dataset.
[29] https://www.kaggle.com/omkargurav/face-mask-dataset
[30] An automated System to limit covid 19 using facial mask detection
in smart city network( 2020, IEEE)
https://ieeexplore.ieee.org/document/9216386.
[31] Junayed M.S., Jeny A.A., Neehal N., Atik S.T., Hossain S.A. (2019)
A Comparative Study of Different CNN Models in City Detection
Using Landmark Images. In: Santosh K., Hegadi R. (eds) Recent
Trends in Image Processing and Pattern Recognition. RTIP2R 2018.
Communications in Computer and Information Science, vol 1035.
Springer, Singapore. https://doi.org/10.1007/978-981-13-9181-
1_48
[32] Javed Mehedi Shamrat F.M., Allayear S.M., Alam M.F., Jabiullah
M.I., Ahmed R. (2019) A Smart Embedded System Model for the
AC Automation with Temperature Prediction. In: Singh M., Gupta
P., Tyagi V., Flusser J., Ören T., Kashyap R. (eds) Advances in
Computing and Data Sciences. ICACDS 2019. Communications in
Computer and Information Science, vol 1046. Springer, Singapore.
https://doi.org/10.1007/978-981-13-9942-8_33.
[33] Shamrat F.M.J.M., Nobel N.I., Tasnim Z., Ahmed R. (2020)
Implementation of a Smart Embedded System for Passenger Vessel
Safety. In: Saha A., Kar N., Deb S. (eds) Advances in
Computational Intelligence, Security and Internet of Things.
ICCISIoT 2019. Communications in Computer and Information
Science, vol 1192. Springer, Singapore.
https://doi.org/10.1007/978-981-15-3666-3_29.
[34] F. M. Javed Mehedi Shamrat, Zarrin Tasnim, Naimul Islam Nobel,
and Md. Razu Ahmed. 2019. An Automated Embedded Detection
and Alarm System for Preventing Accidents of Passengers Vessel
due to Overweight. In Proceedings of the 4th International
Conference on Big Data and Internet of Things (BDIoT'19).
Association for Computing Machinery, New York, NY, USA,
Article 35, 1-5. DOI: https://doi.org/10.1145/3372938.3372973.
[35] F.M. Javed Mehedi Shamrat, Shaikh Muhammad Allayear and Md.
Ismail Jabiullah "Implementation of a Smart AC Automation
System with Room Temperature Prediction", Journal of the
Bangladesh Electronic Society, Volume 18, Issue 1-2, June-
December 2018, ISSN: 1816-1510, pp: 23-32.
... SA Sanjaya et al. [17] used MobileNetV2 to detect face masks in 25 different cities, with an accuracy rate of 96.85 percent. Some methods are detected face masks [18], including max pooling, average pooling and mobileNetV2 architecture. Relatively speaking, some of the existing network models are not suitable for deployment in real-time conditions and are also not suitable for using in embedded devices. ...
Article
Full-text available
To reduce the chance of being infected by the COVID-19, wearing masks correctly when entering and leaving public places has become the most feasible and effective ways to prevent the spread of the virus. It is a concern to how to quickly and accurately detect whether a face is worn a mask correctly while reduce missed detection and false detection in practical applied scenarios. In this paper, an improved algorithm is proposed based on the YOLO-v4 algorithm. The attention mechanism module is added to the appropriate network level to enhance the key feature points of face wearing masks and suppress useless information. Apart from that, three attention mechanism modules are added to different layers of the YOLO-v4 network for ablation experiments, including CBAM (convolutional block attention module), SENet (squeeze-and-excitation networks) and CANet (coordinate attention networks). The path-aggregation network and feature pyramid are used to extract features from images. Two network models were compared and improved in the experiment, and it is found that adding the dual-channel attention mechanism CBAM before the three YOLO heads of YOLOv4 and in the neck network had better detection performance than the single channel attention mechanism SENet and the coordinated attention mechanism CANet. The experimental results show that when the attention module CBAM and the YOLO-v4 model are integrated, the accuracy of the selected MAFA + WIDER Face dataset reaches the highest value of 93.56%, which is 4.66% higher than that of the original YOLO-v4.
... By applying convolutional filters directly to input images, CNNs efficiently isolate high-level features, enhancing both the accuracy and computational speed for tasks like image classification and object detection. FMJM Shamrat et al. [59] exploring three deep learning techniques for face mask recognition: Max pooling, Average pooling, and MobileNetV2. MobileNetV2 achieved the highest accuracies-99.72% in training and 99.82% in validation-demonstrating a robust capability, while H Goyal et al. [60] developed an automated face mask recognition model to enforce mask wearing in public spaces. ...
Article
Full-text available
Masked face recognition (MFR) has emerged as a critical domain in biometric identification, especially with the global COVID-19 pandemic, which introduced widespread face masks. This survey paper presents a comprehensive analysis of the challenges and advancements in recognizing and detecting individuals with masked faces, which has seen innovative shifts due to the necessity of adapting to new societal norms. Advanced through deep learning techniques, MFR, along with face mask recognition (FMR) and face unmasking (FU), represents significant areas of focus. These methods address unique challenges posed by obscured facial features, from fully to partially covered faces. Our comprehensive review explores the various deep learning-based methodologies developed for MFR, FMR, and FU, highlighting their distinctive challenges and the solutions proposed to overcome them. Additionally, we explore benchmark datasets and evaluation metrics specifically tailored for assessing performance in MFR research. The survey also discusses the substantial obstacles still facing researchers in this field and proposes future directions for the ongoing development of more robust and effective masked face recognition systems. This paper serves as an invaluable resource for researchers and practitioners, offering insights into the evolving landscape of face recognition technologies in the face of global health crises and beyond.
... Mask-wearing has become standard practice for preventing the spread of airborne viruses (e.g., influenza). There has been some advancement in mask detection using images [38] and audio [14]. Notably, the prior audio-based method requires a full spectrogram and cannot be applied to our system. ...
Article
Full-text available
The types of human activities occupants are engaged in within indoor spaces significantly contribute to the spread of airborne diseases through emitting aerosol particles. Today, ubiquitous computing technologies can inform users of common atmosphere pollutants for indoor air quality. However, they remain uninformed of the rate of aerosol generated directly from human respiratory activities, a fundamental parameter impacting the risk of airborne transmission. In this paper, we present AeroSense, a novel privacy-preserving approach using audio sensing to accurately predict the rate of aerosol generated from detecting the kinds of human respiratory activities and determining the loudness of these activities. Our system adopts a privacy-first as a key design choice; thus, it only extracts audio features that cannot be reconstructed into human audible signals using two omnidirectional microphone arrays. We employ a combination of binary classifiers using the Random Forest algorithm to detect simultaneous occurrences of activities with an average recall of 85%. It determines the level of all detected activities by estimating the distance between the microphone and the activity source. This level estimation technique yields an average of 7.74% error. Additionally, we developed a lightweight mask detection classifier to detect mask-wearing, which yields a recall score of 75%. These intermediary outputs are critical predictors needed for AeroSense to estimate the amounts of aerosol generated from an active human source. Our model to predict aerosol is a Random Forest regression model, which yields 2.34 MSE and 0.73 r2 value. We demonstrate the accuracy of AeroSense by validating our results in a cleanroom setup and using advanced microbiological technology. We present results on the efficacy of AeroSense in natural settings through controlled and in-the-wild experiments. 
The ability to estimate aerosol emissions from detected human activities is part of a more extensive indoor air system integration, which can capture the rate of aerosol dissipation and inform users of airborne transmission risks in real time.
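The abstract above reports a mean squared error (MSE) and an r-squared value for its regression model. As a minimal sketch of how these two metrics are conventionally computed, the following uses plain Python. The function names and the sample values are assumptions made for illustration and are not taken from the paper.

```python
def mse(y_true, y_pred):
    """Mean squared error: the average of the squared residuals."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical measured and predicted values (illustration only).
measured = [1.0, 2.0, 3.0, 4.0]
predicted = [1.0, 2.0, 3.0, 5.0]

print(mse(measured, predicted))
print(r_squared(measured, predicted))
```

A lower MSE and an r-squared value closer to 1 both indicate predictions that track the measured values more closely.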
... The advantage of a CNN is that it can find useful elements within an image more effectively than working with the whole image, which results in a more accurate detection process (Mehedi Shamrat et al., 2021 ...
Article
In forensics and security, it is important to determine a person's gender. Gender identification using several types of identifiers, such as facial images, voice, or handwriting, has been studied extensively in recent years. However, many criminals are difficult to recognize in CCTV footage because they cover their heads or wear masks that reveal only a certain eye shape. In this article, we explore the use of a CNN with ReLU activation in every hidden layer, together with the Haar Cascade Classifier algorithm for detecting human eyes, to recognize human eyes using deep learning. A total of 11,525 images of male and female eyes from a public dataset taken from Kaggle were used as research data. Using Adam (Adaptive Moment Estimation) optimization, the training procedure ran for 20 epochs. The findings of this study show an accuracy of 92% for identifying gender automatically. A performance evaluation matrix was used in this investigation and yielded an overall F1-score of 93%.
Chapter
COVID-19 has caused a global pandemic that affects human health. As per the report, COVID-19 patients suffered from reasonable symptoms and recuperated without further care, while many people who are affected need proper medical support. For effective protection, the best method is using a face mask. In this paper, we try to identify the best algorithm for detecting a masked face. Fig. 1 shows the blueprint of how the system works. The model has two components. The first component uses a CNN for feature extraction to identify masked and non-masked faces, and random forest, SVM, Naïve Bayes, decision tree and KNN are used for the classification process. Experimental findings suggest that random forest produces more accurate results, with 98.24% accuracy, on the dataset we developed, which consists of roughly 80 photos.
Conference Paper
Cardiovascular disease has become one of the world's major causes of death. Accurate and timely diagnosis is of crucial importance. We constructed an intelligent diagnostic framework for prediction of heart disease, using the Cleveland Heart disease dataset. We have used three machine learning approaches, Decision Tree (DT), K-Nearest Neighbor (KNN), and Random Forest (RF), in combination with different sets of features. We have applied the three techniques to the full set of features, to a set of ten features selected by the "Pearson's Correlation" technique, and to a set of six features selected by the Relief algorithm. Results were evaluated based on accuracy, precision, sensitivity, and several other indices. The best results were obtained with the combination of the RF classifier and the features selected by Relief, achieving an accuracy of 98.36%. This could be improved even further by employing a 5-fold Cross Validation (CV) approach, resulting in an accuracy of 99.337%.
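The 5-fold cross validation mentioned in the abstract above can be illustrated with a small index-splitting sketch. This is a generic illustration in plain Python, not the authors' procedure: the helper name `k_fold_indices` and the sample size are hypothetical, and the folds here are contiguous and unshuffled, whereas practical pipelines typically shuffle or stratify first.

```python
def k_fold_indices(n_samples, k=5):
    """Yield (train_idx, test_idx) index lists for k contiguous folds.

    Each sample lands in exactly one test fold; when n_samples is not
    divisible by k, the first folds get one extra sample each.
    """
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


def mean_accuracy(fold_accuracies):
    """Cross-validated accuracy is the mean of the per-fold accuracies."""
    return sum(fold_accuracies) / len(fold_accuracies)


# Hypothetical dataset of 11 samples split into 5 folds.
for train_idx, test_idx in k_fold_indices(11, k=5):
    print(len(train_idx), test_idx)
```

In each round a model would be trained on `train_idx` and scored on `test_idx`; the reported figure is then the average over the five rounds.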
Chapter
With the advent of modern data analytics tools, understanding the bits and pieces of any environment from an abundance of relevant data has become a reality. Traditional post-event analyses are evolving toward online and real-time processes. Alongside this, versatile algorithms are being proposed to address the data types suitable for dynamic environments. This research investigates different dynamic data mining methods that can be deployed in a modern classroom to assist both the teaching and learning atmosphere based on past and present data. Time series data regarding students' attentiveness, academic history, content of the topic, demography of the classroom and human sentiment analysis would be fed into an algorithm suitable for dynamic operations to make the learning ambience smarter, resulting in better information being available to educators so they can take the most appropriate measures while teaching a topic. The research objective is to propose an algorithm that can later be implemented with a proper hardware set-up.
Article
Cardiovascular diseases are among the most common serious illnesses affecting human health. CVDs may be prevented or mitigated by early diagnosis, and this may reduce mortality rates. Identifying risk factors using machine learning models is a promising approach. We would like to propose a model that incorporates different methods to achieve effective prediction of heart disease. For our proposed model to be successful, we have used efficient Data Collection, Data Pre-processing and Data Transformation methods to create accurate information for the training model. We have used a combined dataset (Cleveland, Long Beach VA, Switzerland, Hungarian and Stat log). Suitable features are selected by using the Relief, and Least Absolute Shrinkage and Selection Operator (LASSO) techniques. New hybrid classifiers like Decision Tree Bagging Method (DTBM), Random Forest Bagging Method (RFBM), K-Nearest Neighbors Bagging Method (KNNBM), AdaBoost Boosting Method (ABBM), and Gradient Boosting Boosting Method (GBBM) are developed by integrating the traditional classifiers with bagging and boosting methods, which are used in the training process. We have also instrumented some machine learning algorithms to calculate the Accuracy (ACC), Sensitivity (SEN), Error Rate, Precision (PRE) and F1 Score (F1) of our model, along with the Negative Predictive Value (NPR), False Positive Rate (FPR), and False Negative Rate (FNR). The results are shown separately to provide comparisons. Based on the result analysis, we can conclude that our proposed model produced the highest accuracy while using RFBM and Relief feature selection methods (99.05%).
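The evaluation measures listed in the abstract above all derive from the four cells of a binary confusion matrix. The sketch below shows the standard definitions in plain Python. The function name and the counts are hypothetical and chosen only for illustration, and the key `npv` denotes the negative predictive value.

```python
def binary_metrics(tp, fp, tn, fn):
    """Compute standard binary-classification metrics from confusion counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn)   # also called recall or true positive rate
    precision = tp / (tp + fp)
    return {
        "accuracy": accuracy,
        "error_rate": 1.0 - accuracy,
        "sensitivity": sensitivity,
        "precision": precision,
        "f1": 2 * precision * sensitivity / (precision + sensitivity),
        "npv": tn / (tn + fn),     # negative predictive value
        "fpr": fp / (fp + tn),     # false positive rate
        "fnr": fn / (fn + tp),     # false negative rate
    }


# Hypothetical counts (illustration only, not results from any paper).
metrics = binary_metrics(tp=40, fp=10, tn=45, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Note that sensitivity and the false negative rate always sum to 1, which is a quick consistency check when comparing reported figures.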
Article
With the expansion of Artificial Neural Networks (ANN), Deep Learning (DL) has brought an interesting turn to the various fields of Artificial Intelligence (AI) by making it smarter and more efficient than what we had even 10-2 years back. DL has been in use in various fields due to its versatility. The Convolutional Neural Network (CNN) is at the major point of advancement that brings together ANN and innovative DL techniques. In this research paper, we have devised a multi-layer, fully connected neural network (NN) with 10 and 12 hidden layers for handwritten digit (HD) recognition. The testing is performed on the publicly available MNIST handwritten database. We selected 60,000 images from the MNIST database for training, and 10,000 images for testing. Our multi-layer ANN (10), ANN (12) and CNN are able to achieve an overall accuracy of 99.10%, 99.34% and 99.70% respectively while determining digits using the MNIST handwriting dataset.
Conference Paper
Chronic kidney disease is the loss of kidney function. Often, the symptoms of the disease are not noticeable, and a significant number of lives are lost annually due to the disease. Using machine learning algorithms in medical studies, the disease can be predicted with a high accuracy rate and in a very short time. The prediction can be done using four supervised classification learning algorithms: logistic regression, Decision Tree, Random Forest and KNN. In this paper, the prediction performance of the algorithms is analyzed using a pre-processed dataset. The performance analysis is based on the accuracy of the results, prediction time, ROC and AUC curves, and error rate. The comparison of the algorithms suggests which algorithm is the best fit for predicting chronic kidney disease.
Preprint
Chronic kidney disease (CKD), a slow and late-diagnosed disease, is one of the most important contributors to the mortality rate in the medical sector nowadays. As a result, a significant number of men and women suffer each year due to the lack of early screening systems and appropriate care. However, patients' lives can be saved by detecting the disease quickly at its earliest stage. In addition, machine learning algorithms can detect the stage of this deadly disease much more quickly given a reliable dataset. In this paper, the overall study has been implemented using four reliable approaches, namely Support Vector Machine (henceforth SVM), AdaBoost (henceforth AB), Linear Discriminant Analysis (henceforth LDA), and Gradient Boosting (henceforth GB), to obtain highly accurate predictions. These algorithms are implemented on an online dataset from the UCI machine learning repository. The highest prediction accuracy, about 99.80%, is obtained from the Gradient Boosting (GB) classifier. Later, different performance evaluation metrics are also displayed to show appropriate outcomes. Finally, the most efficient and optimized algorithms for the proposed job can be selected depending on these benchmarks.
Conference Paper
The COVID-19 pandemic, caused by the novel coronavirus, continues to spread all over the world. The impact of COVID-19 has fallen on almost all sectors of development. The healthcare system is going through a crisis. Many precautionary measures have been taken to reduce the spread of this disease, and wearing a mask is one of them. In this paper, we propose a system that restricts the growth of COVID-19 by identifying people who are not wearing a facial mask in a smart city network where all public places are monitored with Closed-Circuit Television (CCTV) cameras. When a person without a mask is detected, the corresponding authority is informed through the city network. A deep learning architecture is trained on a dataset that consists of images of people with and without masks collected from various sources. The trained architecture achieved 98.7% accuracy in distinguishing people with and without a facial mask on previously unseen test data. It is hoped that our study will be a useful tool to reduce the spread of this communicable disease in many countries of the world.
Conference Paper
Nowadays, the banking transaction system is more flexible than it used to be. The introduction of the ATM booth by the banking sector was a step forward in reducing human effort. An ATM (automated teller machine) dispenses money to the consumer when a card is inserted. All ATM booths support both credit and debit cards for transactions, and this has saved everyone's time. Still, certain situations, such as forgetting the card authentication details for a transaction, can ruin a consumer's day. For this reason, this paper proposes a system to help in such situations. The proposed system uses a face encoding process with an emotion recognition test, based on a Convolutional Neural Network (CNN), to make transactions faster and more accurate. Normal card transactions remain possible alongside the proposed system. The FER2013 dataset was used for training, and the model was then tested using our own sample images. The results show that the proposed system can correctly separate 'Happy' faces from other emotional faces and allow the transaction to proceed.
Article
While the US, UK, France, Italy, and many other countries ended up implementing complete lockdown after tens of thousands of deaths from COVID-19, South Korea kept factories and offices running, flattened the curve, and maintained a low mortality rate. Extensive media coverage has focused on South Korea’s testing capacity as the primary reason, but there has been little discussion of the vital role of the smart city. In this paper, we describe how smart city technologies form a crucial part of disease control in South Korea, explain the social conditions for the extensive use of smart city technology, and offer critical insights into contemporary discussions on the issue of smart cities and surveillance.
Article
In the present research era, machine learning is an important and unavoidable area that provides better solutions in various domains. In particular, deep learning is a cost-efficient, effective supervised learning model that can be applied to various complicated issues. Deep learning has various illustrative features and does not depend on any limited learning methods, which helps it obtain better solutions. Because of its significant performance and advancements, deep learning is widely used in applications such as image classification, face recognition, visual recognition, language processing, speech recognition, object detection, and various science and business analysis tasks. This survey mainly provides insight into deep learning through an intensive analysis of deep learning architectures and their characteristics, along with their limitations. It also analyses recent trends in deep learning across the literature to explore the present evolution of deep learning models.