TELKOMNIKA Telecommunication, Computing, Electronics and Control
Vol. 18, No. 3, June 2020, pp. 1376~1381
ISSN: 1693-6930, accredited First Grade by Kemenristekdikti, Decree No: 21/E/KPT/2018
DOI: 10.12928/TELKOMNIKA.v18i3.14840
Journal homepage: http://journal.uad.ac.id/index.php/TELKOMNIKA
Convolutional neural network
for maize leaf disease image classification
Mohammad Syarief, Wahyudi Setiawan
Informatics Department, University of Trunojoyo Madura, Indonesia
Article Info
Article history:
Received Aug 17, 2019
Revised Jan 4, 2020
Accepted Feb 26, 2020
ABSTRACT
This article discusses maize leaf disease image classification.
The experimental data consist of 200 images in 4 classes: healthy,
cercospora, common rust, and northern leaf blight. The method has 2 steps: feature
extraction and classification. Features are extracted automatically
using a convolutional neural network (CNN). Seven CNN models were
tested, i.e., AlexNet, Visual Geometry Group (VGG) 16, VGG19, GoogleNet,
Inception-V3, residual network 50 (ResNet50), and ResNet101.
Classification uses three machine learning methods: k-nearest
neighbor, decision tree, and support vector machine. Based on the testing
results, the best classification was obtained with AlexNet and support vector machine, with
sensitivity, specificity, and accuracy of 93.5%, 95.08%, and 93%, respectively.
Keywords:
AlexNet
Classification
Convolutional neural network
k-nearest neighbor
Maize leaf image
This is an open access article under the CC BY-SA license.
Corresponding Author:
Wahyudi Setiawan,
Informatics Department,
University of Trunojoyo Madura,
Raya Telang St., Perumahan Telang Inda, Telang, Kamal, Bangkalan, Jawa Timur 69162, Indonesia.
Email: wsetiawan@trunojoyo.ac.id
1. INTRODUCTION
Convolutional neural network (CNN) is a development of the artificial neural network that consists
of tens to hundreds of layers [1]. CNN is a deep learning method that can perform various tasks such as
image classification [2, 3], segmentation [4, 5], recognition [6, 7], and object detection [8, 9]. CNN
technology is now widely applied in fields including medical imaging [10, 11], autonomous driving [12, 13],
robotics [14, 15], and agricultural imaging [16]. Many plant image studies have been carried out, such as disease
classification in 15 food crops using 5 convolutional layers [17] and classification of diseases in 9 classes of plant
images using GoogleNet [18]. Mohanty et al. classified 14 types of food crops, including maize, with
26 disease classes. The testing used a large number of images, i.e., 54,306 images. A deep learning convolutional
neural network with two architectures (AlexNet and GoogleNet) was used for classification.
The classification results showed an accuracy of 31.4% [19].
In this study, classification was carried out to detect diseases in maize leaf images using CNN.
One of the previous studies that classified maize leaf diseases using CNN was Sibiya &
Sumbwanyambe [20]. They used 3 disease classes: northern leaf blight, common rust,
and cercospora. The CNN architecture was not explained in detail; it was only mentioned that it used 50 hidden
layers consisting of convolution layers with filter kernels that have a median of 24, rectified linear units
(ReLU), and pooling layers. One hundred images per class were used, with a ratio of 70% for training and 30%
for testing. The testing results showed an accuracy of 92.85% [20].
Zhang et al. classified 8 diseases in maize leaf images: southern leaf blight, brown spot, curvularia leaf
spot, rust, dwarf mosaic, gray leaf spot, round spot, and northern leaf blight [18]. The CNN architecture used was
GoogleNet (Inception-V1). The experiments were conducted using 3,672 images, 80% for training and 20%
for testing. Classification results showed an accuracy of 98.9% [18]. Hidayat et al. classified three diseases in
maize leaf images: common rust, cercospora, and northern leaf blight [21]. The experiments used 300 maize
leaf images. The average accuracy was 93.67% [21].
Both studies, Sibiya & Sumbwanyambe [20] and Hidayat et al. [21], only explained
the types of CNN layers; the number of layers of each type and the detailed parameters were not given, while
Zhang et al. [18] used an existing CNN architecture, i.e., GoogleNet, which consists of 144 layers. This study
offers two novelties. First, it uses 7 CNN architectures: AlexNet [22], VGG16, VGG19 [23], GoogleNet [23],
Inception-V3 [24], ResNet50, and ResNet101 [25], together with machine learning classification methods (kNN,
decision tree, SVM), to classify maize leaf diseases. Second, the accuracy increased
compared to the previous study.
2. RESEARCH METHOD
The steps of the classification process using CNN are shown in Figure 1. The maize leaf images are
divided into 2 parts: training and testing data. CNN is then applied as a feature
extraction process, without the type of feature having to be specified in advance as in conventional machine learning.
The next process is classification using k-nearest neighbor, support vector machine, and decision tree.
Figure 1. The research method of maize leaf disease image classification
2.1. Maize leaf image
The image data are maize leaf images with a size of 256×256 pixels. The data consist of 200 images, which are
divided into 4 classes with 50 images per class. The experimental data were obtained from the PlantVillage dataset of Mohanty et al. [19].
Examples of the maize leaf images are shown in Figure 2. When training and testing using CNN, the image
size is adjusted to the default input size of each CNN architecture. Table 1 shows the default input size of each CNN model.
Figure 2. (a) Normal, (b) Cercospora, (c) Northern leaf blight, (d) Common rust
Table 1. Default input size of CNN
CNN             Default input size
AlexNet         227×227
VGG16           224×224
VGG19           224×224
GoogleNet       224×224
Inception-V3    299×299
ResNet50        224×224
ResNet101       224×224
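Because the source images are 256×256 pixels, each one has to be rescaled before it is passed to a network. The lookup below is a minimal sketch of that adjustment: the dictionary restates Table 1, while the helper name `scale_factor` is hypothetical and not part of the original experiment.

```python
# Default input sizes restated from Table 1 (height, width in pixels).
DEFAULT_INPUT_SIZE = {
    "AlexNet": (227, 227),
    "VGG16": (224, 224),
    "VGG19": (224, 224),
    "GoogleNet": (224, 224),
    "Inception-V3": (299, 299),
    "ResNet50": (224, 224),
    "ResNet101": (224, 224),
}

SOURCE_SIZE = (256, 256)  # size of the maize leaf images in the dataset


def scale_factor(model_name, source_size=SOURCE_SIZE):
    """Return the (vertical, horizontal) resize factors for one model."""
    target_h, target_w = DEFAULT_INPUT_SIZE[model_name]
    source_h, source_w = source_size
    return target_h / source_h, target_w / source_w


if __name__ == "__main__":
    for name, size in DEFAULT_INPUT_SIZE.items():
        print(name, size, scale_factor(name))
```

Only Inception-V3 requires enlarging the images; the other six models shrink them.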
2.2. Convolutional neural network
CNN consists of 2 main parts: feature extraction and classification. The feature extraction part
includes the input layer, convolutional layers with stride and padding, rectified linear unit (ReLU), pooling,
and batch normalization layers, while the classification part consists of the fully connected layer, softmax, and
output layer. A CNN architecture can have more than one layer of each type [26]. The CNN architectures analyzed in this
paper were AlexNet, VGG16, VGG19, GoogleNet, Inception-V3, ResNet50, and ResNet101. These
architectures have 25, 41, 47, 144, 316, 177, and 347 layers, respectively. Figure 3 shows a simple CNN model that has
13 layers: 1 input layer, 3 convolutional layers with stride and padding, 3 ReLU layers, 1 pooling layer, 2 normalization
layers, 1 fully connected layer (FCL), softmax, and output layer.
Figure 3. Simple CNN model
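The effect of filter size, stride, and padding on the size of a feature map can be illustrated with the standard output-size formula, (W - F + 2P) / S + 1. The sketch below is illustrative only: the stride of 4 and padding of 0 for the first AlexNet layer, and the padding of 1 for a 3×3 filter, are values taken from the commonly published designs and are assumptions here, since this article does not state them.

```python
def conv_output_size(input_size, filter_size, stride=1, padding=0):
    """Spatial output size of a convolution or pooling layer along one axis."""
    return (input_size - filter_size + 2 * padding) // stride + 1


# 227-pixel input, 11x11 filter; stride 4 and padding 0 are assumed values.
alexnet_first_conv = conv_output_size(227, 11, stride=4, padding=0)

# A 3x3 filter with stride 1 and padding 1 (assumed) keeps the size unchanged.
same_size = conv_output_size(224, 3, stride=1, padding=1)

# A 2x2 max-pooling window with stride 2 (assumed) halves the size.
pooled = conv_output_size(224, 2, stride=2, padding=0)

if __name__ == "__main__":
    print(alexnet_first_conv, same_size, pooled)
```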
The AlexNet architecture has twenty-five layers [22]: an input layer; 5 convolutional layers, of which the first
has an 11×11 filter, the second a 5×5 filter, and the third to fifth 3×3 filters;
7 ReLU layers; 2 normalization layers; 3 max-pooling layers; 3 fully connected layers;
2 dropout layers with a rate of 0.5; a softmax layer; and an output layer. The Visual Geometry Group (VGG) at Oxford University created the
VGG16 network architecture with 41 layers. VGG simplifies the design by using a 3×3 filter in every
convolutional layer. The uniform and smaller filter size used in VGG can produce more complex features at a lower
cost per filter than the larger filters used in AlexNet. The VGG16 architecture consists of [23]: an input layer of 224×224 pixels;
13 convolutional layers, of which the first and second have 64 filters, the third and fourth
have 128 filters, the fifth to seventh have 256 filters, and the eighth to thirteenth have
512 filters; 15 ReLU layers; 5 max-pooling layers; 3 fully connected layers; 2 dropout layers with a rate of 0.5; a softmax layer; and an
output layer.
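The layer totals quoted in this section can be checked by adding up the per-type counts given in the text. The tally below is a small sketch of that arithmetic; the dictionary keys are descriptive labels chosen for illustration, not identifiers from any framework.

```python
# Per-type layer counts as listed in the text.
SIMPLE_CNN = {
    "input": 1, "convolution": 3, "relu": 3, "pooling": 1,
    "normalization": 2, "fully_connected": 1, "softmax": 1, "output": 1,
}
ALEXNET = {
    "input": 1, "convolution": 5, "relu": 7, "normalization": 2,
    "max_pooling": 3, "fully_connected": 3, "dropout": 2,
    "softmax": 1, "output": 1,
}
VGG16 = {
    "input": 1, "convolution": 13, "relu": 15, "max_pooling": 5,
    "fully_connected": 3, "dropout": 2, "softmax": 1, "output": 1,
}


def total_layers(layer_counts):
    """Sum the number of layers over all layer types."""
    return sum(layer_counts.values())


if __name__ == "__main__":
    for name, counts in [("Simple CNN", SIMPLE_CNN),
                         ("AlexNet", ALEXNET),
                         ("VGG16", VGG16)]:
        print(name, total_layers(counts))
```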
The VGG19 architecture consists of [23]: an input layer of 224×224 pixels; 16 convolutional
layers, of which the first and second have 64 filters, the third and fourth have
128 filters, the fifth to eighth have 256 filters, and the ninth to sixteenth have 512 filters; 18 ReLU layers;
5 max-pooling layers; 3 fully connected layers; 2 dropout layers with a rate of 0.5; a softmax layer; and an output layer.
ResNet50 and ResNet101 address a problem of very deep networks: increasing the number of layers increases the capacity to learn, but it can also make
learning more and more difficult and cause accuracy to decrease. Residual learning provides a solution to this
problem. Residual Network (ResNet) is a CNN architecture for residual learning, in which
skip connections bypass one or more layers. The ResNet50 architecture has 177 layers, while ResNet101 has 347 layers [26].
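The idea of a skip connection can be shown in a few lines: the block outputs the learned transformation plus its own input, so a block whose transformation is zero simply passes the input through. This is a toy sketch on plain Python lists, not the ResNet implementation; `residual_block`, `halve`, and `zero_transform` are hypothetical names introduced for illustration.

```python
def residual_block(x, transform):
    """Return transform(x) + x element-wise, i.e. a skip connection around transform."""
    fx = transform(x)
    if len(fx) != len(x):
        raise ValueError("transform must preserve the length of its input")
    return [f + xi for f, xi in zip(fx, x)]


def halve(x):
    """Toy stand-in for the stacked layers inside a residual block."""
    return [0.5 * v for v in x]


def zero_transform(x):
    """A transformation that has learned nothing."""
    return [0.0 for _ in x]


if __name__ == "__main__":
    print(residual_block([1.0, 2.0, -3.0], halve))
    print(residual_block([1.0, 2.0, -3.0], zero_transform))
```

With `zero_transform` the block reduces to the identity mapping, which is why adding such blocks should not, in principle, make a deeper network worse than a shallower one.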
GoogLeNet (Inception-V1) is a CNN architecture that has 144 layers. GoogLeNet addresses
a deficiency of VGG, namely its high computational cost in both memory and time. The working principle of Inception
is that the network automatically selects the best convolution results among filters of several sizes. The filter sizes used
in this architecture are 1×1, 3×3, and 5×5 pixels, together with 3×3 max-pooling. Another variant used in
this study was Inception-V3. The Inception-V3 architecture consists of 316 layers [27-29].
2.3. Classification methods
In this study, we used three classification methods for testing: support vector machine [30],
k-nearest neighbor [31], and decision tree [32]. For each test, we configure the network layers,
extract the features, and perform classification using each of the methods above.
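The classification stage can be sketched as follows, assuming the CNN features have already been extracted into one vector per image. Everything specific in this sketch is an assumption made for illustration: the features are synthetic stand-ins rather than real CNN activations, the 64-dimensional feature length, the 70/30 split, the linear kernel, and k = 5 are not taken from the article, and scikit-learn is used only as a convenient library.

```python
import numpy as np
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

N_CLASSES = 4          # healthy, cercospora, common rust, northern leaf blight
IMAGES_PER_CLASS = 50  # as in the dataset described above
FEATURE_DIM = 64       # hypothetical; real CNN feature vectors are much longer


def make_stand_in_features(seed=0):
    """Generate synthetic vectors standing in for CNN features (not real data)."""
    rng = np.random.default_rng(seed)
    centers = rng.normal(0.0, 3.0, size=(N_CLASSES, FEATURE_DIM))
    features = np.vstack([
        center + rng.normal(0.0, 1.0, size=(IMAGES_PER_CLASS, FEATURE_DIM))
        for center in centers
    ])
    labels = np.repeat(np.arange(N_CLASSES), IMAGES_PER_CLASS)
    return features, labels


def compare_classifiers(features, labels, seed=0):
    """Train SVM, kNN, and decision tree on the same split; return test accuracy."""
    x_train, x_test, y_train, y_test = train_test_split(
        features, labels, test_size=0.3, stratify=labels, random_state=seed
    )
    classifiers = {
        "SVM": SVC(kernel="linear"),                  # assumed kernel
        "kNN": KNeighborsClassifier(n_neighbors=5),   # assumed k
        "Decision tree": DecisionTreeClassifier(random_state=seed),
    }
    scores = {}
    for name, clf in classifiers.items():
        clf.fit(x_train, y_train)
        scores[name] = accuracy_score(y_test, clf.predict(x_test))
    return scores


if __name__ == "__main__":
    feats, labs = make_stand_in_features()
    print(compare_classifiers(feats, labs))
```

Replacing `make_stand_in_features` with features taken from a pretrained network would give the same comparison structure as the three scenarios reported in the next section.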
3. RESULT AND DISCUSSION
The experiment was divided into 3 scenarios, in which the output of the 7 CNN models was classified with SVM, kNN, and
decision tree. The testing results for each CNN architecture are given in Table 2 to Table 4. Table 2
shows the testing results using the SVM method, while Table 3 and Table 4 show the results using the
k-nearest neighbor and decision tree methods, respectively.
Table 2. Testing results using SVM
CNN model       Sensitivity (%)   Specificity (%)   Accuracy (%)
AlexNet         95.83             100               95
VGG16           88.4              92.03             88.3
VGG19           88.4              92.03             88.3
ResNet50        87.28             90.63             86.7
ResNet101       90                92.85             90
GoogleNet       83.75             89.13             83.3
Inception-V3    87.55             91.32             86.7
Average         88.74             92.57             88.33
Table 3. Testing results using kNN
CNN model       Sensitivity (%)   Specificity (%)   Accuracy (%)
AlexNet         94.72             94.715            93.3
VGG16           82.23             89.65             76.7
VGG19           88.4              91.94             88.3
ResNet50        73.68             87.2              75
ResNet101       81.98             88.89             80
GoogleNet       84.55             89.41             83.3
Inception-V3    80.98             88.33             80
Average         83.79             90.02             82.37
Table 4. Testing results using decision tree
CNN model       Sensitivity (%)   Specificity (%)   Accuracy (%)
AlexNet         74.33             84.19             73.3
VGG16           77.73             84.58             75
VGG19           76.93             86.89             76.7
ResNet50        83.58             88.22             83.3
ResNet101       76                83.7              73.3
GoogleNet       73.85             85.61             75
Inception-V3    66.53             79.93             65
Average         75.56             84.73             74.51
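For a multi-class problem, sensitivity and specificity are commonly computed per class in a one-versus-rest fashion and then averaged. The sketch below follows that convention; it is an assumption, since the article does not state the exact formulas used, and the confusion matrix is a made-up example, not data from the experiment.

```python
def macro_metrics(confusion):
    """Macro-averaged sensitivity and specificity, plus overall accuracy.

    confusion[i][j] is the number of images of true class i predicted as class j.
    Assumes every class has at least one true sample.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(n))
    sensitivities, specificities = [], []
    for k in range(n):
        tp = confusion[k][k]
        fn = sum(confusion[k]) - tp
        fp = sum(confusion[r][k] for r in range(n)) - tp
        tn = total - tp - fn - fp
        sensitivities.append(tp / (tp + fn))
        specificities.append(tn / (tn + fp))
    return {
        "sensitivity": sum(sensitivities) / n,
        "specificity": sum(specificities) / n,
        "accuracy": correct / total,
    }


# Hypothetical 4-class result: one image of the last class is misclassified.
EXAMPLE_CONFUSION = [
    [5, 0, 0, 0],
    [0, 5, 0, 0],
    [0, 0, 5, 0],
    [0, 0, 1, 4],
]

if __name__ == "__main__":
    print(macro_metrics(EXAMPLE_CONFUSION))
```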
Based on the testing results above, the best classification was produced by the AlexNet architecture with
support vector machine classification. It showed the best performance, with sensitivity,
specificity, and accuracy of 95.83%, 100%, and 95%, respectively. SVM also gave the best average accuracy
across the seven CNN models, 88.33%. Furthermore, 10-fold cross-validation was used for validation. This method divides the data into
10 equal parts. The complete stages are as follows:
a) First, data sections 1 to 9 are used for training, and the final data section is used for testing.
b) Second, data sections 2 to 10 are used for training, and the first data section is used for testing.
c) And so on, until data sections 1 to 8 and 10 are used as training data, while data section 9 is used as
testing data.
d) Find the average value of all rounds.
For more details, an illustration of 10-fold cross-validation is
shown in Figure 4.
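The stages above can be sketched as index bookkeeping: split the sample indices into 10 equal sections and let each section serve as the test set exactly once. This is a minimal illustration, not the code used in the experiment; the function name `k_fold_rounds` is hypothetical, and the rounds are generated in simple ascending order, which differs from the order listed above but does not change the average.

```python
def k_fold_rounds(n_samples, k=10):
    """Return a list of (train_indices, test_indices) pairs, one per round."""
    if n_samples % k != 0:
        raise ValueError("this sketch assumes the data divide into equal parts")
    fold_size = n_samples // k
    indices = list(range(n_samples))
    folds = [indices[i * fold_size:(i + 1) * fold_size] for i in range(k)]
    rounds = []
    for test_fold in range(k):
        test_idx = folds[test_fold]
        train_idx = [i for f, fold in enumerate(folds)
                     if f != test_fold for i in fold]
        rounds.append((train_idx, test_idx))
    return rounds


if __name__ == "__main__":
    for number, (train, test) in enumerate(k_fold_rounds(200), start=1):
        print(number, len(train), len(test))
```

With 200 images, each round trains on 180 images and tests on 20. In practice the indices would usually be shuffled, and often stratified by class, before splitting.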
The results of 10-fold cross-validation, using AlexNet for feature extraction and SVM for classification, are shown in Table 5.
Finally, the results of 10-fold cross-validation were compared with previous studies. Table 6 presents
a comparison between this study and previous studies that used maize leaf images for
disease classification.
Figure 4. 10-fold cross-validation
Table 5. Performance measure of 10-fold cross-validation
Round      Sensitivity (%)   Specificity (%)   Accuracy (%)
1          95.83             95.83             95
2          90                92.86             90
3          100               100               100
4          87.5              88.69             85
5          100               100               100
6          95.83             95.83             95
7          100               100               100
8          80                88.89             80
9          90                92.86             90
10         95.83             95.83             95
Average    93.5              95.08             93
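The averages in the last row of Table 5 follow directly from the ten rounds, as the short check below shows. The lists restate the table values; rounding to two decimals is an assumption about how the averages were reported.

```python
from statistics import mean

# Per-round values restated from Table 5 (rounds 1 to 10).
SENSITIVITY = [95.83, 90, 100, 87.5, 100, 95.83, 100, 80, 90, 95.83]
SPECIFICITY = [95.83, 92.86, 100, 88.69, 100, 95.83, 100, 88.89, 92.86, 95.83]
ACCURACY = [95, 90, 100, 85, 100, 95, 100, 80, 90, 95]

AVERAGES = {
    "sensitivity": round(mean(SENSITIVITY), 2),
    "specificity": round(mean(SPECIFICITY), 2),
    "accuracy": round(mean(ACCURACY), 2),
}

if __name__ == "__main__":
    print(AVERAGES)
```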
Table 6. Results comparison of maize leaf classification
Authors                        Number of classes   Sensitivity (%)   Specificity (%)   Accuracy (%)
Sibiya & Sumbwanyambe [20]     3                   -                 -                 92.85
Zhang et al. [18]              8                   -                 -                 98.9
Hidayat et al. [21]            3                   -                 -                 93.67
Proposed method                4                   93.5              95.08             93
4. CONCLUSION
This study analyzed maize leaf image classification using 7 CNN architectures (AlexNet, VGG16,
VGG19, ResNet50, ResNet101, GoogleNet, and Inception-V3) and three classification methods (SVM, kNN,
and decision tree). The best classification was generated by the AlexNet architecture with SVM. This showed
that, among the tested combinations, AlexNet and SVM were best suited for feature extraction and image classification of maize
leaf diseases. In future work, the accuracy could be increased further by adding optimization methods to the
CNN architectures.
REFERENCES
[1] C. Steger, M. Ulrich, and C. Wiedemann, "Machine Vision Algorithms and Applications," 1st Edition. Weinheim:
Wiley-VCH, 2007.
[2] D. Hana, Q. Liu, and W. Fan, “A new image classification method using CNN transfer learning and web data
augmentation,” Expert Syst. Appl., vol. 95, pp. 43-56, 2018.
[3] C. Zhang et al., “A hybrid MLP-CNN classifier for very fine resolution remotely sensed image classification,”
ISPRS J. Photogramm. Remote Sens., vol. 140, pp. 133-144, 2018.
[4] W. Setiawan, M. I. Utoyo, and R. Rulaningtyas, “Vessels semantic segmentation with gradient descent
optimization,” International Academic Journals., vol. 7, no. 4, pp. 4062-4067, 2018.
[5] W. Setiawan, M. I. Utoyo, and R. Rulaningtyas, “Semantic segmentation of artery-venous retinal vessel using
simple convolutional neural network,” IOP Conference Series : Earth and Environmental Science, vol. 243, no. 1,
pp. 1-10, 2019.
[6] Y. Li, J. Zeng, and S. Shan, “Occlusion aware facial expression recognition using CNN with attention mechanism,”
IEEE Trans. Image Process., vol. 28, no. 5, pp. 2439-2450, 2018.
[7] Y. X. Yang, et al., “Face recognition using the SR-CNN Model,” Sensors, vol. 18, no. 12, 2018.
[8] Y. Chen, et al., “Domain adaptive faster R-CNN for object detection in the wild,” pp. 3339-3348, 2018.
[9] H. Gao, et al., “Object classification using CNN-based fusion of vision and LIDAR in autonomous vehicle
environment,” IEEE Trans. Ind. Informatics, vol. 14, no. 9, pp. 4224-4231, 2018.
[10] A. Khatami, et al., “A sequential search-space shrinking using CNN transfer learning and a Radon projection pool
for medical image retrieval,” Expert Syst. Appl., vol. 100, pp. 224-233, 2018.
[11] M. Frid-adar, et al., “GAN-based synthetic medical image augmentation for increased CNN performance in liver
lesion classification,” Neurocomputing, vol. 321, pp. 321-331, 2018.
[12] S. O. A. Chishti, et al., “Self-driving cars using CNN and Q-learning,” IEEE 21st International Multi-Topic
Conference (INMIC), pp. 1-7, 2018.
[13] A. Dhall, D. Dai, and L. Van Gool, “Real-time 3D Traffic cone detection for autonomous driving,” pp. 494-501, 2019.
[14] K. Mott, “State classification of cooking objects using a VGG CNN,” arXiv preprint arXiv:1904.12613, 2018.
[15] A. Milioto and C. Stachniss, “Bonnet : An open-source training and deployment framework for semantic
segmentation in robotics using CNNs,” International Conference on Robotics and Automation, pp. 7094-7100, 2019.
[16] A. Kamilaris and F. X. Prenafeta-Boldu, “A review of the use of convolutional neural networks in agriculture,”
J. Agric. Sci., vol. 156, no. 3, pp. 312-322, 2018.
[17] S. Sladojevic, et al., “Deep neural networks based recognition of plant diseases by leaf image classification,”
Comput. Intell. Neurosci., vol. 2016, pp. 1-11, 2016.
[18] X. Zhang, et al., “Identification of Maize leaf diseases using improved deep convolutional neural networks,”
IEEE Access, vol. 6, pp. 30370-30377, 2018.
[19] S. P. Mohanty, D. Hughes, and M. Salathé, “Using deep learning for image-based plant disease detection,”
Front. Plant Sci., vol. 7, no. 10, pp. 1-7, 2016.
[20] M. Sibiya and M. Sumbwanyambe, “A computational procedure for the recognition and classification of maize leaf
diseases out of healthy leaves using convolutional neural networks,” Agri Engineering, vol. 1, no. 1, pp. 119-131, 2019.
[21] A. Hidayat, U. Darusalam, and I. Irmawati, “Detection of disease on corn plants using convolutional neural,”
Jurnal Ilmu Komputer dan Informasi, vol. 12, no. 1, pp. 51-56, 2019.
[22] A. Krizhevsky, I. Sutskever, and G. E. Hinton, “ImageNet classification with deep convolutional neural networks,”
Advances in neural information processing systems, pp. 1097-1105, 2012.
[23] K. Simonyan and A. Zisserman, “Very deep convolutional networks for large-scale image recognition,” arXiv
preprint arXiv:1409.1556, 2014.
[24] K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image steganalysis,” Multimed. Tools Appl.,
pp. 770-778, 2015.
[25] K. He and J. Sun, “Deep residual learning for image recognition,” IEEE Xplore, pp. 1-9, 2015.
[26] W. Setiawan, M. I. Utoyo, and R. Rulaningtyas, “Classification of neovascularization using convolutional neural
network model,” TELKOMNIKA Telecommunication Computing Electronics and Control, vol. 17, no. 1,
pp. 463-472, 2019.
[27] C. Szegedy, V. Vanhoucke, S. Ioffe, and J. Shlens, “Rethinking the inception architecture for computer vision,”
IEEE Explor., pp. 2818-2826, 2016.
[28] C. Szegedy, et al., “Inception-v4, Inception-ResNet and the impact of residual connections on learning,”
Thirty-first AAAI conference on artificial intelligence, pp. 1-12, 2017.
[29] C. Szegedy et al., “Going deeper with convolutions,” IEEE conference on computer vision and pattern recognition,
pp. 1-9, 2015.
[30] A. Kowalczyk, "Support vector machine," Morrisville, North Carolina: syncfusion, 2017.
[31] A. S. Prasath, et al., “Distance and similarity measures effect on the performance of K-Nearest neighbor classifier,”
Elsevier, pp. 1-53, 2019.
[32] J. M. Martínez-Otzeta, et al., “K Nearest neighbor edition to guide classification tree learning: Motivation and
experimental results,” Lecture Notes in Computer Science, 2018, pp. 53-63, 2018.