Data Augmentation Technique to Expand Road
Dataset Using Mask RCNN and Image Inpainting
Plabon Kumar Saha
Department of Computer Science and Engineering, American International University-Bangladesh, Dhaka-1229, Bangladesh
pkumarsaha71@gmail.com

Sinthia Ahmed
Department of Computer Science and Engineering, American International University-Bangladesh, Dhaka-1229, Bangladesh
ahmed.sinthia97@gmail.com

Tajbiul Ahmed
Department of Computer Science and Engineering, American International University-Bangladesh, Dhaka-1229, Bangladesh
ahmedtajbiul@gmail.com

Hasidul Islam
Department of Computer Science and Engineering, American International University-Bangladesh, Dhaka-1229, Bangladesh
ihasidul@gmail.com

Al Imran
Department of Computer Science and Engineering, American International University-Bangladesh, Dhaka-1229, Bangladesh
asq24i@gmail.com

A. Z. M. Tahmidul Kabir
Department of Electrical and Electronic Engineering, American International University-Bangladesh, Dhaka-1229, Bangladesh
tahmidulkabir@gmail.com

Al Mamun Mizan
Department of Electrical and Electronic Engineering, American International University-Bangladesh, Dhaka-1229, Bangladesh
almamuneee15@gmail.com
Abstract—Data augmentation is a common technique for enlarging a dataset. This study proposes a method that generates image data for an empty-road dataset. The data-driven approach is a popular way to train intelligent models, and existing or new research that requires empty roads to train data-driven models can benefit from this method. The method uses image inpainting to remove vehicles from input images: the vehicles are detected with Mask RCNN, and the detected objects are then removed by image inpainting. To increase the efficiency of the method, a morphological transformation is applied to the detected masks. The transformation is adjusted according to the output and may need several iterations to ensure better results. The experimental results confirm the efficiency of the method.
Keywords—Mask RCNN, Image inpainting, Data augmentation, Image processing, Object detection.
I. INTRODUCTION
Over the years, artificial intelligence has expanded its
research areas with innovative ideas such as autonomous
vehicles, virtual environment construction, and urban city
planning. Throughout different research domains, many
approaches are being used, one of which is the data-driven
approach [1]. There are many research domains that rely on
data for practical implementations like IoT [2], Image
processing [3], [4], machine learning, etc. Augmented autonomous driving simulation [1] uses a data-driven approach, with real-world road images collected from the ApolloScape, KITTI, and CityScapes datasets to create photorealistic simulation images and renderings. Urban
mapping and image-based vehicle navigation require
extracted road networks but due to the lack of prior road
datasets, methods of extracting road networks from images
encounter a number of issues [5]. Constructing robust virtual cities requires realistic 3D models of the urban built form, and such models in virtual environments require a sufficient amount of real city data [6]. An environment with
moving obstacles is not ideal for creating 3D models, and
little data free of obstacles such as vehicles can be found. When there are vehicles on the road, it is difficult to see road markings such as lanes, crosswalks, and stop lines, so vehicles tend to be an obstacle to road and lane detection research [7]. It is important to ensure the
efficiency of a trained model, and there are times when there
is insufficient data to train the model. The quality, quantity,
and diversity of a data set can influence a model's
performance in a variety of unexpected situations.
Sometimes it is difficult to collect data from real life, so image augmentation is considered in order to build a diverse image dataset. Image augmentation is the artificial creation of
training images using various processing methods that are
typically required to improve the performance of AI models
[8].
The proposed methodology in this paper combines image
segmentation and image inpainting to augment existing road
images and create an obstacle (vehicle) free road dataset that
can be used by other studies where empty road data is
required. The Mask RCNN method was used to segment
vehicles from road images and create masks for them. Mask RCNN is a popular object detection and instance
segmentation method that produced cutting-edge results on
the MSCOCO [10] dataset [9]. The masks are then removed
from the images using the image inpainting method. The
applications of inpainting are diverse, ranging from the
restoration of damaged paintings and photographs to the
removal/replacement of selected objects [11]. The inpainting
method detects adjacent pixels in the masked area and fills
them to remove the vehicle from the road images. Lastly, the
similarity ratio between the input image and the output image is calculated. The details are elaborated in the
methodology and result sections of the study.
II. BACKGROUND STUDY
In [12], the authors worked on detecting unattended objects in surveillance camera systems. Foreground blob extraction is used to maintain two continuously updated backgrounds. Upon success, the proposed algorithm identifies an object removed or forgotten by a person relative to the previous frame. Although the object detection of the method is very accurate, it only worked with humans who are standing still. In this paper [13]
a method of removing natural objects is proposed. Here a
target detection algorithm was proposed which was based on
contour transformation. In this paper [14] the authors have
discussed various techniques to execute the image inpainting
method. The approaches to performing inpainting in various cases were surveyed, and their drawbacks in terms of performance, output, and efficiency were discussed. Although the study showcases the implementation of inpainting methods in various projects, it lacks the data or actual visualization of output comparisons needed to fully understand the mentioned techniques of image
inpainting. The authors of this [15] paper introduced a
method of removing unwanted objects or items from an input image. They used the Viola-Jones object detection method for facial recognition, which helped them identify the main subject of the frame, and using an SVM the proposed model crops the image to keep only the main subject. This method is very efficient if the unwanted object is in a corner of the image, but because it relies on cropping, the model is not very effective in circumstances like removing an item in the middle of the image. In another
study [16] an algorithm was proposed for removing rain
from images. They decomposed the input image and used
edge detection to extract a mask that will capture the details
of the rain-removed image. Then, using a defogging algorithm, they removed the remaining rain.
The method proposed in [17] is an improved version of Mask RCNN for a service robot's target component capture. The authors replaced Mask RCNN with a Light-Head Mask RCNN network, modifying the RCNN subnet and ROI warping. This allows the system to reduce complexity and obtain slightly better detection results than Mask RCNN, but it compromises the real-time detection speed and efficiency that Mask RCNN provides. In this paper
[18] the authors have developed a system of automatically
reading meter values from an image. Mask RCNN was
modified to read all sorts of digital values from an image. In
the following study [19] an implementation of mask RCNN
was studied. The author used Mask RCNN to do the object
localization and segmentation of nuclei. They used Mask RCNN with a ResNet-101-FPN backbone on the BBBC038v1 dataset, which contains microscopic nuclei images. In the
following study [20] another implementation of mask RCNN
was conducted. With the help of Fmask-RCNN, they automated the identification of dents, convexities, holes, damage, and distortion in containers. They modified the model using a Res2Net101 structure, path fusion augmentation, fusion upsampling, etc., which helps them correctly identify the various kinds of damage in containers from a single image
frame. The following paper [21] proposed a model that uses
Mask RCNN and stereo vision to identify various objects
such as bananas, apples, oranges from various distances. The
method not only identifies the items but also can name the
item that is being identified. The image set was collected
from the COCO17 dataset. In this study [22] the authors
have proposed a deep neural network model for image inpainting. They separated inpainting into two stages, namely structure reconstruction and texture
generation. The first part completes the missing structure and
the other part generates the texture. In this study [23] the
authors have proposed a method to inpaint objects from a
video frame and single image. The method converts the
video into a number of frames. Then after foreground
extraction, the method uses graph-based region
segmentation. Finally, using their proposed method, they accomplished the inpainting and converted the frames back into video. This study [24] proposes a method of removing snow or rain from a single image. Using image decomposition and dictionary learning, they extracted the rain and snow portion of the image along with its other details. In this
study [25] the authors have demonstrated the importance of
data augmentation in a deep learning image classifier model.
They have shown the lack of training data for image
classification and comparison of available methods of data
augmentation. Later they have proposed their own method of
data augmentation and the use of the generated output to
train a deep learning model. In another study [26] the authors
have enhanced a small data set to a larger data set for model
training. They successfully generated masks of microscopic objects and implemented data augmentation to prepare photo-realistic images of blood cells. In this research
[27] the authors have used data augmentation for training
disease detection intelligent model. They have stated the
issues with open-source data set and proposed a method of
creating their own data set for training their model. They
have used simple augmentation tactics like crop, rotation,
image scaling, zoom, channel shift, etc. The limitation of medical-related intelligent models due to lack of data is also discussed in a study [28]. They therefore proposed image synthesis to increase the dataset size, using two-stage generative adversarial
networks (GAN) to generate a binary mask and later in
another step used GAN for conditional generation of
synthesized image. Later they have used these images to
generate more training datasets. The researchers of [29] proposed a library for image augmentation. They implemented their augmentation methods on various kinds of commonly required images, like map images, animal images, landscape images, etc. Image augmentation has also been used to enhance a deep neural network model for detecting malware families in a malware environment.
III. METHODOLOGY
The project follows a formal procedure shown in Figure
1 to generate the output image. The process starts when Mask RCNN is applied to the input image, returning a mask of the car object. Dilation is then applied to the white mask. Finally, image inpainting is performed with the mask.
Fig. 1. Flow chart of image inpainting.
A. Mask RCNN
Mask R-CNN (Region-Based Convolutional Neural
Network) is an extension of Faster R-CNN. Mask RCNN
adds a mask-prediction branch to Faster RCNN. ROI pooling was used in Faster RCNN; to produce precise pixel-level mask segmentation, it is replaced with ROI Align. After the mask is generated, Mask RCNN uses the classification and bounding-box outputs of Faster RCNN to produce precise segmentation. Mask RCNN can detect and separate
different objects from a single image. It generates proposals
regarding where an object might be and then predicts the
class of the objects. After detecting the objects, it generates
a bounding box of that image. Additionally, the model
generates masks of the detected objects. Figure 2(a) shows the input image. From the image, Mask RCNN can detect several objects, as shown in
Figure 2(b), where a potted plant and a car are detected, and
a bounding box is created for both the objects. After
detecting the car object, a mask of the car is generated,
which is shown in Figure 3.
Fig. 2. (a) Input image; (b) bounding boxes generated after object detection.
Fig. 3. Mask of the Car object
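To illustrate this step, the following is a minimal sketch of vehicle-mask extraction. The paper does not name its Mask RCNN implementation, so a pretrained torchvision model (trained on MSCOCO [10], where class 3 is "car") is assumed here, and the score threshold is likewise an illustrative assumption.

    import cv2
    import numpy as np
    import torch
    import torchvision
    from torchvision.transforms.functional import to_tensor

    # Assumption: a COCO-pretrained torchvision Mask R-CNN stands in for
    # the paper's (unspecified) implementation. Newer torchvision versions
    # use weights="DEFAULT" instead of pretrained=True.
    model = torchvision.models.detection.maskrcnn_resnet50_fpn(pretrained=True)
    model.eval()

    CAR_LABEL = 3  # "car" in the COCO label map used by torchvision

    def vehicle_mask(image_bgr, score_thresh=0.7):
        # Returns a binary mask (uint8, 0 or 255) covering all detected cars.
        rgb = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2RGB)
        with torch.no_grad():
            pred = model([to_tensor(rgb)])[0]
        mask = np.zeros(image_bgr.shape[:2], dtype=np.uint8)
        for label, score, m in zip(pred["labels"], pred["scores"], pred["masks"]):
            if label.item() == CAR_LABEL and score.item() >= score_thresh:
                # Each soft mask m has shape (1, H, W); binarize at 0.5.
                mask |= (m[0].numpy() >= 0.5).astype(np.uint8) * 255
        return mask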
B. Morphological transformation & image inpainting
After generating the precise mask of the image, direct image inpainting was first applied. The output is shown in Figure 5. A simple visual evaluation shows that the result of the inpainting is not very good: the car is still partially visible even after inpainting.
To produce better results, morphological transformation
was done on the mask. Morphological transformation is a
simple operation done on binary images based on image
shape. To perform the transformation, only the input image and a kernel are required. A 4x7 kernel and the dilation operation were used. The result is shown in Figure 4.
Fig. 4. Mask of the car after dilation
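A minimal sketch of the dilation step follows. The 4x7 kernel is the paper's reported size; the single iteration is an assumption, since the paper notes the transformation may be adjusted over several iterations.

    import cv2
    import numpy as np

    # 4x7 rectangular kernel as described above.
    kernel = np.ones((4, 7), np.uint8)

    def dilate_mask(mask, iterations=1):
        # Grows the binary vehicle mask so it also covers shadow pixels
        # that Mask RCNN leaves out of the segmentation.
        return cv2.dilate(mask, kernel, iterations=iterations)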
After generating the mask of the object, image inpainting is applied to the image. Image inpainting is a method of removing unwanted elements from an image, such as noise, strokes, text, or, as in this study, an object. The method is generally used to restore old damaged photos. It works by replacing the unwanted pixels with neighboring pixels. The inpaint method needs the input image and the mask of the image. The outputs of the function without (Figure 5) and with (Figure 6) the morphological transformation are given below.
Fig. 5. Image inpainting without dilation.
Fig. 6. Output of inpainting after using the dilated mask.
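The inpainting step can be sketched with OpenCV's inpaint function. The paper does not state which inpainting algorithm or radius was used, so the Telea algorithm and a 3-pixel radius are assumptions here.

    import cv2

    def remove_vehicle(image_bgr, dilated_mask, radius=3):
        # Fills the masked region from neighboring pixels. INPAINT_NS is
        # the alternative flag if Navier-Stokes inpainting is preferred.
        return cv2.inpaint(image_bgr, dilated_mask, radius, cv2.INPAINT_TELEA)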
C. Output Validation
To validate how accurately the system works, a validation method was introduced. To calculate a similarity percentage, an image of the same frame without the car object (Figure 7) was compared with the output after inpainting using the morphologically transformed mask (Figure 6). The similarity percentage indicates how similar the output is to the scene as it would appear if the car were not actually present in that frame. The higher the similarity rate, the higher the proposed model's success rate. Using Python OpenCV's SIFT (Scale-Invariant Feature Transform), the number of key points in both images was calculated. Good match points between the two images were then found using OpenCV's knnMatch function. Finally, the accuracy percentage was calculated as the ratio of good match points to the number of key points.
Fig. 7. Image without car taken from the same frame
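A sketch of this validation metric is given below. The paper does not report its matching threshold or the exact denominator of the ratio, so Lowe's 0.75 ratio test and the smaller of the two keypoint counts are assumptions.

    import cv2

    def similarity_percentage(img_a, img_b, ratio=0.75):
        gray_a = cv2.cvtColor(img_a, cv2.COLOR_BGR2GRAY)
        gray_b = cv2.cvtColor(img_b, cv2.COLOR_BGR2GRAY)
        sift = cv2.SIFT_create()
        kp_a, des_a = sift.detectAndCompute(gray_a, None)
        kp_b, des_b = sift.detectAndCompute(gray_b, None)
        # k=2 nearest neighbors per descriptor; keep "good" matches that
        # pass the ratio test.
        matches = cv2.BFMatcher().knnMatch(des_a, des_b, k=2)
        good = [p[0] for p in matches
                if len(p) == 2 and p[0].distance < ratio * p[1].distance]
        return 100.0 * len(good) / min(len(kp_a), len(kp_b))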
IV. RESULT AND ANALYSIS
The mask RCNN is used to detect the object and generate
a mask from the object. The object detection rate of Mask RCNN is very good. Figure 2(b) shows the confidence rate for the car in the image: the confidence that the detected object is a car is 98.7%.
Next, the morphological transformation has an important impact on the method. Figures 5 and 6 show a side-by-side comparison of the inpainted result without and with the transformation. In the first figure (Figure 5), inpainting done without dilation fails to generate a good result: the car shape is partially visible. On the other hand, when the method uses the transformed mask, the output looks very good.
If the model uses the raw generated mask for image inpainting, its accuracy rate is 93% (shown in Figure 8). After using the transformed mask, not only does the accuracy of the model increase, but the partially visible car also disappears. The reason for the inaccuracy is the dark shadow of the car at the bottom of the detected object. Mask RCNN does not count the shadow as part of the segmentation, so it leaves that portion out. Since image inpainting works by replacing the masked pixels with neighboring pixels, the dark shadow of the car degrades the result. After dilation, the mask is enlarged and the shadow portion is also covered. Consequently, this produces much better output than the mask without dilation: the inpainting gives an accuracy of 96% after applying dilation to the mask (Figure 9). After inpainting, the accuracy rate of the output image is checked using the previously discussed method. Thus, an optimal inpainting output is ensured by applying dilation, where without dilation the inpainting fails to generate good output. In Figure 10, the input image and the output image are compared and their difference is marked as a red mask; a red mask in the shape of the car is clearly visible. Some demo outputs are given in Figure 11.
Fig. 8. Accuracy of inpainting without dilation
Fig. 9. Accuracy of inpainting with dilation
Fig. 10. Difference between input and output picture
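The Figure 10 visualization can be reproduced with a sketch like the following; the difference threshold is an assumption, since the paper does not report one.

    import cv2

    def highlight_difference(input_img, output_img, thresh=30):
        # Per-pixel absolute difference, reduced to one channel.
        diff = cv2.absdiff(input_img, output_img).max(axis=2)
        overlay = output_img.copy()
        overlay[diff > thresh] = (0, 0, 255)  # mark changed pixels in red (BGR)
        return overlay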
V. NOVELTY OF THE WORK
The method explained in this paper can be used to
increase the size of any dataset with images of roads with
vehicles. If a model needs vehicle-free road image data for training, then using this method, image data with vehicles can be turned into images of roads without vehicles. Often the
models for self-driving cars are trained in a simulated
environment. These kinds of simulation environments can be
improved by scaling up. Data-driven algorithms can scale up
and improve their model by generating more data using the
method explained in this paper and providing more data to
the model. It has been seen that fusing large-scale driving datasets with statistical models can improve efficiency by ten thousand times [30]. However, a new researcher seeking this appealing result would need around 8.8 billion driving miles of logged data to provide enough evidence to compare the safety of autonomous vehicles and human driving [30]. So researchers can apply our method to existing datasets to increase their image data. Particularly,
researchers who do not have a source of a large amount of
data can use this method to generate more data for their
model.
VI. CONCLUSION
As the results show, the proposed system is a robust method for removing vehicles from road datasets. This allows other researchers to obtain better training data for their respective models. The accuracy of each stage, from object detection to image inpainting, is very good. Demo output results are shown in Figure 11. In the future, this data generation technique can support various fields such as virtual world construction, self-driving cars, and so on.
REFERENCES
[1] Li, W., Pan, C., Zhang, R., Ren, J., Ma, Y., Fang, J., Yan, F., Geng, Q., Huang, X., Gong, H., Xu, W., Wang, G., Manocha, D. and Yang, R., 2019. AADS: Augmented autonomous driving simulation using data-driven algorithms. Science Robotics, 4(28), p.eaaw0863.
[2] A. Z. M. Tahmidul Kabir, A. M. Mizan, N. Debnath, A. J. Ta-sin, N.
Zinnurayen and M. T. Haider, "IoT Based Low Cost Smart Indoor
Farming Management System Using an Assistant Robot and Mobile
App," 2020 10th Electrical Power, Electronics, Communications,
Controls and Informatics Seminar (EECCIS), Malang, Indonesia,
2020, pp. 155-158, doi: 10.1109/EECCIS49483.2020.9263478.
[3] A. M. Mizan, A. Z. M. Tahmidul Kabir, N. Zinnurayen, T. Abrar, A.
J. Ta-sin and Mahfuzar, "The Smart Vehicle Management System for
Accident Prevention by Using Drowsiness, Alcohol, and Overload
Detection," 2020 10th Electrical Power, Electronics,
Communications, Controls and Informatics Seminar (EECCIS),
Malang, Indonesia, 2020, pp. 173-177, doi:
10.1109/EECCIS49483.2020.9263429.
[4] M. E. Raihan, U. Rafin Akther, S. Afrin, F. M. Chowdhury and M.
Rawnak Sarker, "Toddlers Working Memory Development via Visual
Attention and Visual Sequential-Memory," 2019 22nd International
Conference on Computer and Information Technology (ICCIT),
Dhaka, Bangladesh, 2019, pp. 1-6, doi:
10.1109/ICCIT48885.2019.9038580.
[5] B. Chen, W. Sun and A. Vodacek, "Improving image-based
characterization of road junctions, widths, and connectivity by
leveraging OpenStreetMap vector map," 2014 IEEE Geoscience and
Remote Sensing Symposium, Quebec City, QC, Canada, 2014, pp.
4958-4961, doi: 10.1109/IGARSS.2014.6947608.
[6] Chen, F. Huang and Y. Fang, "Integrating virtual environment and
GIS for 3D virtual city development and urban planning," 2011 IEEE
International Geoscience and Remote Sensing Symposium,
Vancouver, BC, Canada, 2011, pp. 4200-4203, doi:
10.1109/IGARSS.2011.6050156.
[7] J. Kim, J. Yoo and J. Koo, "Road and Lane Detection Using Stereo
Camera," 2018 IEEE International Conference on Big Data and Smart
Computing (BigComp), Shanghai, China, 2018, pp. 649-652, doi:
10.1109/BigComp.2018.00117.
4
[8] M. C. Olgun, Z. Baytar, K. M. Akpolat and O. Koray Sahingoz,
"Autonomous Vehicle Control for Lane and Vehicle Tracking by
Using Deep Learning via Vision," 2018 6th International Conference
on Control Engineering & Information Technology (CEIT), Istanbul,
Turkey, 2018, pp. 1-7, doi: 10.1109/CEIT.2018.8751764.
[9] K. He, G. Gkioxari, P. Dollár and R. Girshick, "Mask R-CNN," Proceedings of the IEEE International Conference on Computer Vision (ICCV), 2017, pp. 2961-2969.
[10] Lin TY. et al. (2014) Microsoft COCO: Common Objects in Context.
In: Fleet D., Pajdla T., Schiele B., Tuytelaars T. (eds) Computer
Vision – ECCV 2014. ECCV 2014. Lecture Notes in Computer
Science, vol 8693. Springer, Cham. https://doi.org/10.1007/978-3-
319-10602-1_48.
[11] Marcelo Bertalmio, Guillermo Sapiro, Vincent Caselles, and Coloma
Ballester. 2000. Image inpainting. In Proceedings of the 27th annual
conference on Computer graphics and interactive techniques
(SIGGRAPH '00). ACM Press/Addison-Wesley Publishing Co.,
USA, 417–424. DOI: https://doi.org/10.1145/344779.344972.
[12] L. H. Jadhav and B. F. Momin, "Detection and identification of
unattended/removed objects in video surveillance," 2016 IEEE
International Conference on Recent Trends in Electronics,
Information & Communication Technology (RTEICT), Bangalore,
India, 2016, pp. 1770-1773, doi: 10.1109/RTEICT.2016.7808138.
[13] L. Chonglun, "Removing Natural Objects from the Sea Surface
Background Image Based on Contour Map and Local-Hausdorff
Distance," 2016 3rd International Conference on Information Science
and Control Engineering (ICISCE), Beijing, China, 2016, pp. 519-
526, doi: 10.1109/ICISCE.2016.118.
[14] M. Mahajan and P. Bhanodia, "Image inpainting techniques for
removal of object," International Conference on Information
Communication and Embedded Systems (ICICES2014), Chennai,
India, 2014, pp. 1-4, doi: 10.1109/ICICES.2014.7034008.
[15] N. Shan, D. S. Tan, M. S. Denekew, Y. -Y. Chen, W. -H. Cheng and
K. -L. Hua, "Photobomb Defusal Expert: Automatically Remove
Distracting People From Photos," in IEEE Transactions on Emerging
Topics in Computational Intelligence, vol. 4, no. 5, pp. 717-727, Oct.
2020, doi: 10.1109/TETCI.2018.2865215.
[16] J. Liu, S. Teng and Z. Li, "Removing Rain from Single Image Based
on Details Preservation and Background Enhancement," 2019 IEEE
2nd International Conference on Information Communication and
Signal Processing (ICICSP), Weihai, China, 2019, pp. 322-326, doi:
10.1109/ICICSP48821.2019.8958586.
[17] J. Shi, Y. Zhou and W. X. Q. Zhang, "Target Detection Based on
Improved Mask Rcnn in Service Robot," 2019 Chinese Control
Conference (CCC), Guangzhou, China, 2019, pp. 8519-8524, doi:
10.23919/ChiCC.2019.8866278.
[18] A. Azeem, W. Riaz, A. Siddique and U. A. K. Saifullah, "A Robust
Automatic Meter Reading System based on Mask-RCNN," 2020
IEEE International Conference on Advances in Electrical
Engineering and Computer Applications( AEECA), Dalian, China,
2020, pp. 209-213, doi: 10.1109/AEECA49918.2020.9213531.
[19] Johnson J.W. (2020) Automatic Nucleus Segmentation with Mask-
RCNN. In: Arai K., Kapoor S. (eds) Advances in Computer Vision.
CVC 2019. Advances in Intelligent Systems and Computing, vol 944.
Springer, Cham. https://doi.org/10.1007/978-3-030-17798-0_32.
[20] Li X., Liu Q., Wang J., Wu J. (2020) Container Damage Identification
Based on Fmask-RCNN. In: Zhang H., Zhang Z., Wu Z., Hao T. (eds)
Neural Computing for Advanced Applications. NCAA 2020.
Communications in Computer and Information Science, vol 1265.
Springer, Singapore. https://doi.org/10.1007/978-981-15-7670-6_2.
[21] M. Songhui, S. Mingming and H. Chufeng, "Objects detection and
location based on mask RCNN and stereo vision," 2019 14th IEEE
International Conference on Electronic Measurement & Instruments
(ICEMI), Changsha, China, 2019, pp. 369-373, doi:
10.1109/ICEMI46757.2019.9101563.
[22] Y. Ren, X. Yu, R. Zhang, T. H. Li, S. Liu and G. Li, "StructureFlow:
Image Inpainting via Structure-Aware Appearance Flow," 2019
IEEE/CVF International Conference on Computer Vision (ICCV),
Seoul, Korea (South), 2019, pp. 181-190, doi:
10.1109/ICCV.2019.00027.
[23] D. J. Tuptewar and A. Pinjarkar, "Robust exemplar based image and
video inpainting for object removal and region filling," 2017
International Conference on Intelligent Computing and Control
(I2C2), Coimbatore, India, 2017, pp. 1-4, doi:
10.1109/I2C2.2017.8321964.
[24] Y. Wang, S. Liu, C. Chen and B. Zeng, "A Hierarchical Approach for
Rain or Snow Removing in a Single Color Image," in IEEE
Transactions on Image Processing, vol. 26, no. 8, pp. 3936-3950,
Aug. 2017, doi: 10.1109/TIP.2017.2708502.
[25] A. Mikołajczyk and M. Grochowski, "Data augmentation for
improving deep learning in image classification problem," 2018
International Interdisciplinary PhD Workshop (IIPhDW),
Świnouście, Poland, 2018, pp. 117-122, doi:
10.1109/IIPHDW.2018.8388338.
[26] O. Bailo, D. Ham and Y. M. Shin, "Red Blood Cell Image Generation
for Data Augmentation Using Conditional Generative Adversarial
Networks," 2019 IEEE/CVF Conference on Computer Vision and
Pattern Recognition Workshops (CVPRW), Long Beach, CA, USA,
2019, pp. 1039-1048, doi: 10.1109/CVPRW.2019.00136.
[27] Gorad, B. & Kotrappa, D. S. Novel Dataset Generation for Indian
Brinjal Plant Using Image Data Augmentation IOP Conference
Series: Materials Science and Engineering, IOP Publishing, 2021,
1065, 012041.
[28] Pandey, S.; Singh, P. R. & Tian, J. An image augmentation approach
using two-stage generative adversarial network for nuclei image
segmentation Biomedical Signal Processing and Control, 2020, 57,
101782.
[29] Buslaev, A.; Iglovikov, V.I.; Khvedchenya, E.; Parinov, A.;
Druzhinin, M.; Kalinin, A.A. Albumentations: Fast and Flexible
Image Augmentations. Information 2020, 11, 125.
https://doi.org/10.3390/info11020125.
[30] N. Kalra and S. M. Paddock, “Driving to safety: How many miles of
driving would it take to demonstrate autonomous vehicle reliability?”
Transportation Research Part A: Policy and Practice, vol. 94, pp. 182–193, 2016.
Fig. 11. A: confidence rate of identifying the vehicle to remove; B: mask of the car without dilation; C: output after inpainting using the mask without dilation; D: mask of the vehicle after dilation; E: output after inpainting using the mask with dilation.