Insect Detection and Classification Based
on an Improved Convolutional Neural Network
Denan Xia 1, Peng Chen 1,2,*, Bing Wang 3, Jun Zhang 4,* and Chengjun Xie 5
1 School of Computer Science and Technology, Anhui University, Hefei 230601, Anhui, China; ahu0086@163.com
2 Institute of Physical Science and Information Technology, Anhui University, Hefei 230601, Anhui, China
3 School of Electrical and Information Engineering, Anhui University of Technology, Ma'anshan 243032, Anhui, China; wangbing@ustc.edu
4 School of Electrical Engineering and Automation, Anhui University, Hefei 230601, Anhui, China
5 Institute of Intelligent Machines, Hefei Institutes of Physical Science, Chinese Academy of Sciences, Hefei 230031, Anhui, China; cjxie@iim.ac.cn
* Correspondence: pchen@ahu.edu.cn (P.C.); wwwzhangjun@163.com (J.Z.); Tel.: +86-551-6386-1469 (P.C.)
Received: 28 August 2018; Accepted: 16 November 2018; Published: 27 November 2018


Abstract:
Insect pests are one of the most important factors affecting crop yield. Since many insect species are visually very similar, insect detection on field crops, such as rice and soybean, is more challenging than generic object detection.
Presently, distinguishing insects in crop fields mainly relies on manual classification, but this is
an extremely time-consuming and expensive process. This work proposes a convolutional neural
network model to solve the problem of multi-classification of crop insects. The model can make full
use of the advantages of the neural network to comprehensively extract multifaceted insect features.
During the regional proposal stage, the Region Proposal Network is adopted rather than a traditional
selective search technique to generate a smaller number of proposal windows, which is especially
important for improving prediction accuracy and accelerating computations. Experimental results
show that the proposed method achieves a heightened accuracy and is superior to the state-of-the-art
traditional insect classification algorithms.
Keywords:
convolutional neural network; insect detection; field crops; region proposal network; VGG19
1. Introduction
Insect pests are a major factor in the world's agricultural economy, so it is particularly important to prevent and control them [1], for example through dynamic surveys and insect population management with real-time monitoring systems [2]. However, there are many insect species in farmlands, and their manual classification by insect experts is very time consuming [3]. It is well known that different insect species may have similar phenotypes, and insects often take on complicated phenotypes in different environments and growth periods [4,5]. Since people without knowledge of entomology cannot distinguish insect categories or growth periods, it is necessary to develop more rapid and effective approaches to tackle this problem.
The development of machine learning algorithms has provided an excellent solution for insect image recognition [6–8]. Computer vision [9] and machine learning methods have achieved great successes in vehicle identification and pedestrian detection. Li et al. [10] combined convolutional neural networks (CNN) with an edge boxes algorithm to accurately recognize pedestrians in images. Several issues have to be addressed in the process of the recognition and classification of insects, however, which are briefly described as follows:
1) Quickly locating an insect in a complex background;
2) Accurately distinguishing insect species with high intra-class and inter-class similarity;
3) Effectively identifying the different phenotypes of the same insect species in different growth periods.
Xie et al. [7] combined a sparse-coding technique for encoding insect images with a multiple-kernel learning (MKL) technique to construct an insect recognition system, which achieved a mAP (mean average precision) of 85.5% on 24 common insects in crop fields [11]. Xie's method, however, requires extensive image preprocessing, such as image denoising and segmentation [12,13], which demands a lot of time and technical support, so predictions on images without preprocessing might not be satisfactory. Lim et al. adopted AlexNet and Softmax to build an insect classification system, which was optimized by adjusting the network architecture [14]. Yalcin et al. [15] proposed an image-based insect classification method using four feature extraction methods: Hu moments (Hu), Elliptic Fourier Descriptors (EFD), Radial Distance Functions (RDF) and Local Binary Patterns (LBP), but the images need to be preprocessed manually, which is undoubtedly very time consuming. Pjd et al. [16] proposed a prototype automated identification system that distinguishes five parasitic wasps by identifying differences in wing structure. Mayo and Watson [17] developed an automatic identification system using support vector machines to recognize the images of 774 live moths, without manually specifying the region of interest (ROI). Ding and Taylor [18] proposed a neural network model based on deep learning [19] to classify and count moths and achieved successful results. Moreover, Ding's work showed that the model achieves better results under ideal experimental conditions. The quality of images, however, is often affected by sunlight, obstructions, etc., and such blurred images might affect the recognition and classification of insects.
Traditional machine learning algorithms have certain limitations in the field of image recognition. Recently, many researchers have found that deep learning offers enormous advantages for feature extraction from images through the adaptive learning of artificial neurons, which avoids the need to manually design and extract suitable features. The authors propose a convolutional neural network model for insect recognition and classification. The workflow of this model is roughly divided into two stages:
1) During the first stage, VGG19 [20], a deep network consisting of 19 layers, is adopted to extract high-dimensional features from insect images, together with an RPN that combines the highly abstracted information and is trained to learn the actual locations of insects in images;
2) During the second stage, the feature maps are reshaped to a uniform size and converted into a one-dimensional vector for insect classification.
2. Materials and Methods
2.1. Dataset: Data Preprocessing and Augmentation
The dataset used in Xie's work [7] was adopted in this work; it contains images of 24 insect species common in crop fields, such as Aelia sibirica, Atractomorpha sinensis, Chilo suppressalis, etc. Figure 1 shows the scientific names and sample images of the 24 insect species. To improve the generalization ability of the model, more images collected from the Internet were used together with a data augmentation technique.
Figure 1. Sample images of 24 insect species collected from crop fields.
Due to the small size of Xie's data set, collecting new images was required. The authors manually collected images through search engines such as Baidu and Google, from which relevant images were selected manually. Table 1 lists the information of the insect species, covering Xie's data set and the images collected from crop fields and the Internet. After excluding some images with errors and low quality, 660 images were used in this work, of which 60 images were randomly selected for the test data set and the remaining 540 images for the training one.
Moreover, to avoid over-fitting of the model, data augmentation was performed on the training data set to increase the number of training samples. Bilinear interpolation [21] was adopted to resize the images to 450 × 750 pixels, and all images were then rotated by 90°, 180° and 270°. Salt-and-pepper noise [22], which randomly whitens some pixels and blackens others, was also added to the images to enrich the training data. As a result, these techniques expanded the number of training samples to eight times the original number. Meanwhile, an annotation file containing the bounding boxes and the category of each insect was generated for each image.
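As a minimal sketch of the augmentation pipeline described above, the snippet below uses OpenCV and NumPy; the library choice, the 2% noise ratio, and the image orientation (450 pixels high by 750 wide) are illustrative assumptions rather than details reported in the paper. One plausible reading of the eight-fold expansion is 4 rotations × {clean, noisy}:

```python
import cv2
import numpy as np

def add_salt_pepper(img, ratio=0.02):
    """Randomly whiten and blacken a fraction of the pixels (ratio is assumed)."""
    noisy = img.copy()
    mask = np.random.rand(*img.shape[:2])
    noisy[mask < ratio / 2] = 0          # pepper: black pixels
    noisy[mask > 1 - ratio / 2] = 255    # salt: white pixels
    return noisy

def augment(img):
    """Resize with bilinear interpolation, rotate by 90/180/270 degrees,
    and add salt-and-pepper noise, yielding 8 samples per original image."""
    base = cv2.resize(img, (750, 450), interpolation=cv2.INTER_LINEAR)  # (width, height)
    rotated = [base,
               cv2.rotate(base, cv2.ROTATE_90_CLOCKWISE),
               cv2.rotate(base, cv2.ROTATE_180),
               cv2.rotate(base, cv2.ROTATE_90_COUNTERCLOCKWISE)]
    samples = []
    for r in rotated:
        samples.append(r)
        samples.append(add_salt_pepper(r))
    return samples  # 4 rotations x {clean, noisy} = 8 images
```

In a real pipeline the bounding box annotations would have to be transformed together with each rotation, which is omitted here.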
After data augmentation, the training data set was expanded to 4800 images, with 200 images for each insect species. The test data set contained 480 images, with 20 images for each species. Thus, the insect data set "MPest" was ready for the subsequent insect identification.
Table 1. Information of 24 insect species collected from Xie’s data set, crop fields and the Internet.
Species                    Quantity   Species                    Quantity   Species                    Quantity
Aelia sibirica             66         Colposcelis signata        73         Mythimna separta           49
Atractomorpha sinensis     60         Dolerus tritici            91         Nephotettix bipunctatus    66
Chilo suppressalis         53         Erthesina fullo            49         Pentfaleus major           83
Chromatomyia horticola     51         Eurydema dominulus         128        Pieris rapae               61
Cifuna locuples            47         Eurydema gebleri           42         Sitobion avenae            60
Cletus punctiger           60         Eysacoris guttiger         60         Sogatella furcifera        71
Cnaphalocrocis medinalis   53         Laodelphax striatellua     82         Sympiezomias velatus       55
Colaphellus bowvingi       56         Maruca testulalis          56         Tettigella viridis         55
2.2. Deep Learning
Deep learning was proposed by Hinton et al. [19] in 2006 and is a learning model with multiple layers of hidden perceptrons. It combines low-level features into more abstract high-level features to discover valuable relationships in massive data sets [23]. Multi-layer perceptrons are adopted to explore sophisticated structures at multiple levels of abstraction. A deep convolutional neural network seeks hidden relationships in complex data by using the back-propagation algorithm to adjust the parameters of the neurons at each layer. LeCun proposed a multilayer neural network trained with the back-propagation algorithm, which achieved a low error rate on data sets of handwritten characters [24]. Deep learning, outperforming the state-of-the-art traditional machine learning algorithms, has greatly improved the capabilities of image recognition and related tasks.
2.3. Overall CNN Architecture
An improved network architecture was implemented based on VGG19 [20]. Figure 2 shows the schematic diagram of this network. The 19-layer CNN can be thought of as a self-learning progression of local image features from low to mid to high level. The first 16 convolutional layers of VGG19 were adopted to extract features. Higher convolutional layers reduce the resolution of the feature map and extract more abstract high-level features. The Region Proposal Network (RPN) [25,26] was attached to the first 16 layers; it can propose the locations of insects from the feature map and remove the influence of unrelated background on the classification results. Moreover, the last two fully connected layers, FC6 and FC7, were used to capture complex, comprehensive feature information. This architecture is appropriate for learning local features from a complex natural image dataset [27].
Figure 2. The schematic structure of the proposed detection model based on VGG19.
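The paper implements the network in Caffe; purely as an illustration, the following PyTorch sketch shows one way a pre-trained VGG19 convolutional feature extractor could be obtained. The torchvision layer ordering and the pretrained flag (whose exact name varies across torchvision versions) are assumptions of this sketch, not the authors' code:

```python
import torch
import torchvision

# Load VGG19 pre-trained on ImageNet and keep only its convolutional part
# (16 conv layers with ReLU activations interleaved with max-pooling layers).
vgg19 = torchvision.models.vgg19(pretrained=True)
feature_extractor = vgg19.features

# A 450x750 RGB image gives a 512-channel feature map downscaled by 32.
dummy = torch.randn(1, 3, 450, 750)
with torch.no_grad():
    fmap = feature_extractor(dummy)
print(fmap.shape)  # torch.Size([1, 512, 14, 23])
```

In the detection model described here, the RPN is attached to the last shared convolutional feature map, while the original VGG19 classifier layers are replaced by the detection-specific FC6/FC7 head.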
2.4. Region Proposal Network
The Region Proposal Network [25,26] takes an image of arbitrary size as input and outputs a set of rectangular proposal boxes, where each box has an object score. To generate regional proposals, a small network, which takes an n × n spatial window of a convolutional feature map as input, was slid over the convolutional map of the last shared convolutional output. A 3 × 3 sliding window was selected for the convolutional mapping and generated a 512-dimensional feature vector. Figure 3 shows the architecture of the Region Proposal Network used in this work. At each sliding-window location, multiple regional proposals were generated, corresponding to various scales and aspect ratios. Here three scales and three aspect ratios were used, which resulted in k = 9 regional proposals for each sliding location. These proposals are also called anchors. The regional proposals were then input into two fully connected layers: the bounding box regression layer (reg) and the bounding box classification layer (cls).
Figure 3. Region Proposal Network (RPN).
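As an illustration of the sliding-window design described above, a minimal RPN head might look as follows. This is a PyTorch sketch under the assumptions of 512 input channels and k = 9 anchors; it is not the authors' Caffe implementation:

```python
import torch
import torch.nn as nn

class RPNHead(nn.Module):
    """3x3 sliding window over the shared feature map, followed by two
    1x1 sibling layers: objectness classification (cls) and box regression (reg)."""
    def __init__(self, in_channels=512, num_anchors=9):
        super().__init__()
        self.conv = nn.Conv2d(in_channels, 512, kernel_size=3, padding=1)
        self.cls = nn.Conv2d(512, num_anchors * 2, kernel_size=1)  # object / background scores
        self.reg = nn.Conv2d(512, num_anchors * 4, kernel_size=1)  # box offsets

    def forward(self, feature_map):
        h = torch.relu(self.conv(feature_map))   # 512-d vector at each sliding position
        return self.cls(h), self.reg(h)

# Each spatial location yields k = 9 anchors (3 scales x 3 aspect ratios),
# each with 2 scores and 4 box offsets.
scores, deltas = RPNHead()(torch.randn(1, 512, 14, 23))
print(scores.shape, deltas.shape)  # [1, 18, 14, 23] [1, 36, 14, 23]
```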
To train the Region Proposal Network, each target proposal area in an image was assigned a binary class label (insect or background), while the remaining areas were discarded. A proposal region was assigned a positive label if it had the highest intersection-over-union (IoU) overlap with a ground truth box; it was assigned a negative label if its IoU with all ground-truth boxes was lower than the IoU threshold. The IoU ratio is defined as follows:

IoU = \frac{area(B_{pest} \cap B_{ground})}{area(B_{pest} \cup B_{ground})},  (1)

where area(B_{pest} \cap B_{ground}) represents the intersection area of the insect proposal box and the ground truth box, and area(B_{pest} \cup B_{ground}) denotes their union area.
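Equation (1) translates directly into code; the short sketch below assumes boxes given in (x1, y1, x2, y2) corner format:

```python
def iou(box_a, box_b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

# Example: a proposal covering half of a ground-truth box
print(iou((0, 0, 100, 100), (50, 0, 150, 100)))  # ~0.333
```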
Given an image containing insects, the goal is to extract a fixed-length feature vector for each insect from the complex background produced by the convolution operations. Because the proposed insect-like regions have different sizes, region of interest (ROI) pooling was adopted to convert them to a fixed spatial size so that feature vectors of the same dimension can be generated. Each ROI feature map was then fed into the FC6 and FC7 fully connected layers, whose output is a 4096-dimensional feature vector encoding the location and category information of the target. Finally, the feature vector was passed to the Softmax layer to classify the insect and refine the insect-like region simultaneously.
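To illustrate the ROI pooling step, the snippet below uses torchvision's roi_pool operator; the 7 × 7 output size and the 1/32 spatial scale are assumptions of this sketch, since the paper does not report those values:

```python
import torch
from torchvision.ops import roi_pool

feature_map = torch.randn(1, 512, 14, 23)           # shared convolutional features
# Insect-like regions in image coordinates: (batch_index, x1, y1, x2, y2)
proposals = torch.tensor([[0.,  30.,  40., 220., 260.],
                          [0., 300., 100., 700., 420.]])

# Regions of different sizes are pooled to a fixed 7x7 grid so that the
# flattened vectors fed to FC6/FC7 all have the same dimension.
pooled = roi_pool(feature_map, proposals, output_size=(7, 7), spatial_scale=1 / 32)
print(pooled.shape)             # torch.Size([2, 512, 7, 7])
print(pooled.flatten(1).shape)  # torch.Size([2, 25088]) -> input to the FC layers
```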
Training Region Proposal Network
Each input image was resized to 450 × 750 pixels as previously discussed. The feature map was obtained from the input image by the convolution operations of this network. However, these operations involve huge numbers of computations, which might result in gradient explosion or gradient vanishing. Therefore, a rectified linear unit (ReLU) layer was added to activate or suppress the output feature map of each convolutional layer. A max pooling layer was added after the second, fourth, seventh, eleventh and fifteenth convolutional layers. Moreover, to prevent over-fitting, the pre-trained VGG19 model was adopted to initialize the first sixteen convolutional layers of this model.
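The pooling placement stated above can be written out explicitly. The sketch below builds a 16-convolution stack with ReLU after every convolution and max pooling after the 2nd, 4th, 7th, 11th and 15th convolutional layers; the channel progression follows standard VGG widths and is an assumption of this sketch:

```python
import torch.nn as nn

def make_backbone():
    # (number of conv layers, output channels) per stage, standard VGG-style widths (assumed)
    cfg = [(2, 64), (2, 128), (4, 256), (4, 512), (4, 512)]
    # 1-based indices of conv layers followed by max pooling, as described in the text
    pool_after = {2, 4, 7, 11, 15}
    layers, in_ch, idx = [], 3, 0
    for n_convs, out_ch in cfg:
        for _ in range(n_convs):
            idx += 1
            layers += [nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
                       nn.ReLU(inplace=True)]
            in_ch = out_ch
            if idx in pool_after:
                layers.append(nn.MaxPool2d(kernel_size=2, stride=2))
    return nn.Sequential(*layers)

backbone = make_backbone()  # 16 conv layers, 5 pooling layers in total
```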
2.5. Loss Function
Different loss functions were employed for the bounding box classification layer and the bounding box regression layer. The classification layer produces an insect-like category score P for each predicted region, while the regression layer outputs a coordinate vector loc = (x, y, m, n) for each predicted region, where x and y denote the horizontal and vertical coordinates of the predicted region, and m and n denote its width and height. Following Girshick's multi-task loss rule [28], the classification and regression losses were combined, so the whole loss function of the proposed model is:

L(\{\alpha_i\}, \{s_i\}) = \frac{1}{N_{cls}} \sum_i L_{cls}(\alpha_i, \alpha_i^*) + \frac{1}{\beta N_{reg}} \sum_i \alpha_i^* L_{reg}(s_i, s_i^*),  (2)
where \alpha_i is the predicted probability of anchor i being an object; the ground-truth label \alpha_i^* is 1 if the region box is positive and 0 otherwise; s_i denotes the four parameterized coordinates of the predicted bounding box; and s_i^* is the ground-truth box associated with a positive region box. \beta is the balancing parameter, N_{cls} is the mini-batch size, and N_{reg} is the number of anchor locations. L_{cls} is the classification loss function over two classes (object or background), written as follows:

L_{cls}(\alpha_i, \alpha_i^*) = -\log[\alpha_i \alpha_i^* + (1 - \alpha_i)(1 - \alpha_i^*)].  (3)
For the regression loss, L_{reg} is a smooth L_1 loss:

L_{reg}(s_i, s_i^*) = f(s_i - s_i^*), \quad f(x) = \begin{cases} 0.5x^2 & \text{if } |x| < 1 \\ |x| - 0.5 & \text{otherwise.} \end{cases}  (4)
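A direct transcription of Equations (2)–(4) into code is shown below as a sketch; the batch bookkeeping (N_reg approximated by the number of sampled anchors) and the value of β are illustrative assumptions:

```python
import torch

def cls_loss(alpha, alpha_star):
    """Equation (3): binary log loss over object/background probabilities."""
    return -torch.log(alpha * alpha_star + (1 - alpha) * (1 - alpha_star))

def smooth_l1(x):
    """Equation (4): 0.5*x^2 for |x| < 1, |x| - 0.5 otherwise."""
    absx = x.abs()
    return torch.where(absx < 1, 0.5 * x ** 2, absx - 0.5)

def rpn_loss(alpha, alpha_star, s, s_star, beta=10.0):
    """Equation (2): classification term normalized by the mini-batch size,
    regression term (positive anchors only) normalized by beta * N_reg."""
    n_cls = alpha.numel()
    n_reg = s.shape[0]  # N_reg approximated by the number of sampled anchors here
    l_cls = cls_loss(alpha, alpha_star).sum() / n_cls
    l_reg = (alpha_star.unsqueeze(1) * smooth_l1(s - s_star)).sum() / (beta * n_reg)
    return l_cls + l_reg
```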
2.6. Training Overall Model
The proposed model combines the VGG19 and RPN networks. The weights of the network were initialized from the pre-trained VGG19 model; the initial learning rate was 0.001, the momentum was 0.9, and the weight decay was 0.0005. The weights were updated by stochastic gradient descent. Rather than training two separate networks, VGG19 and RPN were trained alternately to optimize the model. In the first step, the RPN was trained on images annotated with insect-like location bounding boxes. Next, the predicted regions generated by the RPN were input to VGG19 to train it. Finally, RPN and VGG19 were trained jointly and fine-tuned while fixing the shared convolutional layers. Figure 4 illustrates the flowchart of insect recognition and classification.
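For reference, the reported hyperparameters (learning rate 0.001, momentum 0.9, weight decay 0.0005, reduced by a factor of 10 every 20,000 iterations as described in Section 3.3, 160,000 iterations as in Section 3.1) would look roughly as follows in a PyTorch-style training setup; this is an illustration, not the authors' Caffe solver:

```python
import torch

# 'model' stands for the combined VGG19 + RPN network described above;
# a tiny placeholder module is used here so the sketch runs on its own.
model = torch.nn.Linear(10, 2)

optimizer = torch.optim.SGD(model.parameters(),
                            lr=0.001,          # initial learning rate
                            momentum=0.9,
                            weight_decay=0.0005)

# Divide the learning rate by 10 every 20,000 iterations (see Section 3.3).
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=20000, gamma=0.1)

for iteration in range(160000):
    optimizer.zero_grad()
    # In the real model this would be the combined RPN + classification loss
    # for one mini-batch; a dummy loss keeps the sketch self-contained.
    loss = model(torch.randn(4, 10)).sum()
    loss.backward()
    optimizer.step()
    scheduler.step()
```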
Figure 4. Flowchart of insect recognition and classification. Abundant images were obtained by taking photos in crop fields and by collecting images online through Baidu and Google; these insect images are unprocessed and of uneven quality. Image preprocessing and data augmentation were then applied to form our own dataset "MPest". The model, trained on the "MPest" dataset, can effectively help to recognize insects and diseases.
3. Experiments and Results
This section presents the evaluation of the proposed method for insect recognition and a detailed analysis of the model under different parameter settings. All experiments were implemented on the Caffe framework [29], and mean Average Precision (mAP), as defined in Everingham's work [11], was used as the evaluation metric.
3.1. Effects of Feature Extraction Network
To evaluate the effect of feature extraction on the performance of this model, several feature extraction networks (ZF, VGG16 and the proposed VGG19-based network) were evaluated on the dataset "MPest". Generally, the more convolutional layers a model has, the more complex the features it can learn from images. All of the compared methods were trained for 160,000 iterations with an initial learning rate of 0.001. Figure 5 shows the performance comparison of the three methods on this dataset. The VGG16 network achieved good performance, but the proposed network achieved an improvement of 3.72% over VGG16. The ZF network has a simple architecture of only five convolutional layers with 7 × 7 filters, which shrinks the feature map quickly and thus cannot effectively extract the multi-faceted features of insects from images. Both VGG16 and VGG19 are deep neural networks; the former contains sixteen convolutional layers and the latter nineteen layers. High-level convolutional layers effectively enhance the ability of the proposed model to extract insect characteristics under complex backgrounds. Moreover, 3 × 3 filters were used in VGG because multiple stacked 3 × 3 filters provide more nonlinearity than a single 7 × 7 filter, which makes the decision function more discriminative.
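A quick back-of-the-envelope check of the filter-size argument: three stacked 3 × 3 convolutions cover the same 7 × 7 receptive field as a single 7 × 7 convolution, but with fewer parameters and three non-linearities instead of one. The channel count of 512 is an arbitrary choice for the illustration:

```python
C = 512  # number of input and output channels, chosen only for illustration

params_three_3x3 = 3 * (3 * 3 * C * C)   # three stacked 3x3 layers
params_one_7x7   = 7 * 7 * C * C         # a single 7x7 layer

print(params_three_3x3)                   # 7077888
print(params_one_7x7)                     # 12845056
print(params_one_7x7 / params_three_3x3)  # ~1.81x more parameters for the 7x7 filter
```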
Figure 5. Comparison of different feature extraction methods.
Figure 6 shows the convolutional feature maps of the three feature extraction networks on our dataset "MPest". The feature maps of the ZF network lose much pixel information and are blurred, so that it is hard to distinguish insects in them. The feature maps of VGG16 show that the pixel information of the images is retained well by using the small filters. Although the feature maps of VGG19 and VGG16 look almost the same, more abstract high-dimensional information is retained by VGG19 since more convolutional layers are applied.
Figure 6. Visualization of feature maps of different feature extraction networks. (a) ZF Net; (b) VGG16; (c) VGG19.
3.2. Effects of IoU Threshold
The goal of the training task is to reduce the difference between the predicted regions and the ground truth, so reducing the influence of unrelated background in the predicted regions is very important. The IoU threshold was varied from 0.3 to 0.8 in steps of 0.1. Figure 7 shows that the mAP curve increases gradually as the IoU threshold rises from 0.3 to 0.5.
Figure 7. Effects of IoU threshold (VGG19).
That is, increasing the threshold discards more predicted regions that overlap little with the ground truth. The mAP curve reaches its maximum when the threshold is set to 0.5, where 89.22% of the ground truth boxes are detected successfully. From 0.6 to 0.8, the curve declines slowly: the larger the IoU threshold, the more predicted regions the model abandons in the regional proposal stage. A lower threshold admits predicted regions with an excessively small overlap with the ground truth, so more background is carried into the classification task, while a higher threshold discards too many predicted regions, which prevents the model from being trained successfully.
3.3. Effects of Learning Rate
Learning rate is an important hyperparameter that controls the update speed of the network weights. The effect of the learning rate was investigated from 0.0006 to 0.0014. The learning rate schedule of this work is not adaptive: the model first sets an initial learning rate and then reduces it by a factor of 10 every 20,000 iterations. Figure 8 shows the mAP of the model with respect to the learning rate.
Figure 8. Effects of Learning Rate (VGG19).
Looking at Figure 8, it can be seen that, as the learning rate increases toward 0.001, the error of the model gradually decreases: a small learning rate leads to slow weight updates, and the convergence of the model is therefore not ideal. When the learning rate exceeds 0.001, the mAP of the proposed model begins to decrease. A higher learning rate makes the model converge too fast, which results in a larger loss value than expected during the iterative process and causes the model to over-fit.
3.4. Performance Comparison with Other Methods
Table 2 shows the performance comparison of this method with other state-of-the-art methods on this study's test set, namely the Single Shot Multibox Detector (SSD) and the Fast Region-based Convolutional Neural Network (Fast RCNN). SSD is a prominent algorithm in which a recurrent feature-pyramid structure is employed for detection. Several separate predictors perform the classification and regression tasks simultaneously on the multiple feature maps of the network, so the target detection problem is processed using multi-feature information. SSD yielded its best performance after 30,000 iterations with a learning rate of 0.001. The proposed method clearly outperformed the SSD model, achieving an improvement in mAP of 3.73%. The inference time of the proposed method was the lowest among the three methods, about 0.083 s per image. Moreover, the training of SSD took about 38 hours, which was longer than that of the proposed method.
Table 2. Comparison with other methods.

Method            mAP      Inference Time (s)/Image   Training Time (h)
Proposed method   0.8922   0.083                      11.2
SSD               0.8534   0.120                      38.4
Fast RCNN         0.7964   0.195                      70.1
Fast RCNN is a regional proposal and target classification algorithm that adopts a selective search technique to generate proposal windows. Table 2 shows that Fast RCNN achieved a mAP of 0.7964 after 60,000 iterations. However, Fast RCNN took about 70 hours to train, and the detection time per image was about 0.195 s. Therefore, Fast RCNN requires more computational resources and time for insect detection than the proposed method.
Attempts were also made to compare the proposed model with more recent network modules and models. For instance, Inception modules were added to this model, or different convolutional layers were replaced, but the experimental results showed that this changed the accuracy of the model only slightly. Moreover, ResNet (Residual Neural Network) could not achieve a satisfactory result because the pixel size of the images in this work was too large.
Moreover, the differences between this proposed model and other methods can be summarized in two aspects. First, regarding the insect data set, the insect images were collected under field conditions rather than under ideal conditions, which gives the proposed model stronger anti-interference capability. Second, regarding insect recognition, the proposed method can actually locate insects in images, while most other methods only implement image classification. To conclude, the proposed method can effectively reduce the human effort and artificial burden involved in processing the data set.
4. Conclusions and Future Work
A target recognition method based on an improved VGG19 for quickly and accurately detecting insects in images was proposed. Since the Caffe library provides a pre-trained VGG19 model, whose architecture achieves a good balance in feature extraction, the authors fine-tuned the pre-trained model to obtain the model of this study. The experimental results on the dataset "MPest" showed that this method is faster and more accurate than existing methods.
However, there are still some issues in the proposed method, such as target detection errors. Its performance can therefore be further improved as follows:
1) The insect database needs to be augmented, which can be done by further manual collection in the future;
2) More appropriate models for extracting helpful insect-like areas from images should be tried;
3) Regarding the classification task, the classification of insects needs to be more detailed, and the growth periods of insects should be distinguished, so that workers can implement different pest control measures according to the period of insect growth.
Supplementary Materials:
The source codes are available online at http://deeplearner.ahu.edu.cn/web/cnnPest.htm.
Author Contributions:
Conceptualization, P.C.; Data curation, D.X. and C.X.; Funding acquisition, P.C.; Investigation, B.W.; Methodology, J.Z.; Project administration, P.C.; Software, J.Z. and C.X.; Supervision, P.C.; Validation, B.W.; Writing–original draft, D.X.; Writing–review & editing, P.C., B.W. and C.X.
Funding:
This research was funded by the National Natural Science Foundation of China (Nos. 61672035,
61300058, 61872004, 61472282 and 31401293).
Conflicts of Interest: The authors declare no conflict of interest.
References
1. Estruch, J.J.; Carozzi, N.B.; Desai, N.; Duck, N.B.; Warren, G.W.; Koziel, M.G. Transgenic plants: An emerging approach to pest control. Nat. Biotechnol. 1997, 15, 137–141.
2. Fedor, P.; Vaňhara, J.; Havel, J.; Malenovsky, I.; Spellerberg, I. Artificial intelligence in pest insect monitoring. Syst. Entomol. 2009, 34, 398–400.
3. Hanysyam, M.N.M.; Fauziah, I.; Khairiyah, M.H.S.; Fairuz, K.; Rasdi, Z.M.; Zfarina, M.Z.N.; Elfira, S.E.; Ismail, R.; Norazliza, R. Entomofaunal diversity of insects in FELDA Gunung Besout 6, Sungkai, Perak. In Proceedings of the 2013 IEEE Business Engineering and Industrial Applications Colloquium (BEIAC), Langkawi, Malaysia, 7–9 April 2013; pp. 234–239.
4. Gaston, K.J. The magnitude of global insect species richness. Conserv. Biol. 2010, 5, 283–296.
5. Siemann, E.; Tilman, D.; Haarstad, J. Insect species diversity, abundance and body size relationships. Nature 1996, 380, 704–706.
6. Zhang, H.; Huo, Q.; Ding, W. The application of AdaBoost-neural network in stored-product insect classification. In Proceedings of the IEEE International Symposium on IT in Medicine and Education, Xiamen, China, 12–14 December 2009; pp. 973–976.
7. Xie, C.; Zhang, J.; Li, R.; Li, J.; Hong, P.; Xia, J.; Chen, P. Automatic classification for field crop insects via multiple-task sparse representation and multiple-kernel learning. Comput. Electron. Agric. 2015, 119, 123–132.
8. Xie, C.; Wang, R.; Zhang, J.; Chen, P.; Li, R.; Chen, T.; Chen, H. Multi-level learning features for automatic classification of field crop pests. Comput. Electron. Agric. 2018, 152, 233–241.
9. Szeliski, R. Computer Vision: Algorithms and Applications; Springer: London, UK, 2010; Volume 21, pp. 2601–2605.
10. Li, H.; Wu, Z.; Zhang, J. Pedestrian detection based on deep learning model. In Proceedings of the International Congress on Image and Signal Processing, BioMedical Engineering and Informatics, Datong, China, 15–17 October 2017; pp. 796–800.
11. Everingham, M.; Eslami, S.M.A.; Gool, L.V.; Williams, C.K.I.; Winn, J.; Zisserman, A. The Pascal Visual Object Classes Challenge: A Retrospective. Int. J. Comput. Vis. 2015, 111, 98–136.
12. Lin, D.; Lu, C.; Huang, H.; Jia, J. RSCM: Region Selection and Concurrency Model for multi-class weather classification. IEEE Trans. Image Process. 2017, 26, 4154–4167.
13. Keys, R. Cubic convolution interpolation for digital image processing. IEEE Trans. Acoust. Speech Signal Process. 2003, 29, 1153–1160.
14. Lim, S.; Kim, S.; Kim, D. Performance effect analysis for insect classification using convolutional neural network. In Proceedings of the 2017 7th IEEE International Conference on Control System, Computing and Engineering (ICCSCE), Penang, Malaysia, 24–26 November 2017; pp. 210–215.
15. Yalcin, H. Vision based automatic inspection of insects in pheromone traps. In Proceedings of the International Conference on Agro-Geoinformatics, Istanbul, Turkey, 20–24 July 2015; pp. 333–338.
16. Pjd, W.; O'Neill, M.A.; Gaston, K.J.; Gauld, I.D. Automating insect identification: Exploring the limitations of a prototype system. J. Appl. Entomol. 2010, 123, 1–8.
17. Mayo, M.; Watson, A.T. Automatic species identification of live moths. Knowl.-Based Syst. 2007, 20, 195–202.
18. Ding, W.; Taylor, G. Automatic moth detection from trap images for pest management. Comput. Electron. Agric. 2016, 123, 17–28.
19. Lecun, Y.; Bengio, Y.; Hinton, G. Deep learning. Nature 2015, 521, 436.
20. Simonyan, K.; Zisserman, A. Very deep convolutional networks for large-scale image recognition. arXiv 2014, arXiv:1409.1556.
21. Kirkland, E.J. Bilinear interpolation. In Advanced Computing in Electron Microscopy; Springer: Boston, MA, USA, 2010; pp. 261–263.
22. Chan, R.H.; Ho, C.W.; Nikolova, M. Salt-and-pepper noise removal by median-type noise detectors and detail-preserving regularization. IEEE Trans. Image Process. 2005, 14, 1479–1485.
23. Du, X.; Cai, Y.; Wang, S.; Zhang, L. Overview of deep learning. In Proceedings of the 2016 31st Youth Academic Annual Conference of Chinese Association of Automation (YAC), Wuhan, China, 11–13 November 2016; pp. 159–164.
24. Lecun, Y.; Bottou, L.; Bengio, Y.; Haffner, P. Gradient-based learning applied to document recognition. Proc. IEEE 1998, 86, 2278–2324.
25. Ren, S.; He, K.; Girshick, R.; Sun, J. Faster R-CNN: Towards real-time object detection with region proposal networks. In International Conference on Neural Information Processing Systems; MIT Press: Cambridge, MA, USA, 2015; pp. 91–99.
26. Sommer, L.; Schumann, A.; Schuchert, T.; Beyerer, J. Multi feature deconvolutional Faster R-CNN for precise vehicle detection in aerial imagery. In Proceedings of the IEEE Winter Conference on Applications of Computer Vision (WACV), Lake Tahoe, NV, USA, 12–15 March 2018; pp. 635–642.
27. Razavian, A.S.; Azizpour, H.; Sullivan, J.; Carlsson, S. CNN features off-the-shelf: An astounding baseline for recognition. In Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPR 2014), Columbus, OH, USA, 24–27 June 2014.
28. Girshick, R. Fast R-CNN. In Proceedings of the 2015 IEEE International Conference on Computer Vision (ICCV), Santiago, Chile, 7–13 December 2015; pp. 1440–1448.
29. Jia, Y.; Shelhamer, E.; Donahue, J.; Karayev, S.; Long, J.; Girshick, R.; Guadarrama, S.; Darrell, T. Caffe: Convolutional architecture for fast feature embedding. In Proceedings of the ACM International Conference on Multimedia, Orlando, FL, USA, 3–7 November 2014; pp. 675–678.
© 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).