Plant Disease Detection Using Machine Learning

Shima Ramesh
Assistant Professor, Department of Electronics and
Communication,
MVJ College of Engineering,
Bangalore, India
Niveditha M, Pooja R, Prasad Bhat N, Shashank N
Research Scholars, Department of Electronics and
Communication,
MVJ College of Engineering,
Bangalore, India
Mr. Ramachandra Hebbar,
Senior Scientist, ISRO, RRSC-S,
Marathalli, Bangalore, India,
hebbar4@gmail.com
Mr. P V Vinod
Scientist, ISRO, RRSC-S,
Marathalli, Bangalore, India
ramasubramoniams@gmail.com
Abstract: Crop diseases are a major threat to food security,
but their rapid identification remains difficult in many parts
of the world because the necessary infrastructure is lacking.
Accurate techniques for leaf-based image classification have
recently shown impressive results. This paper uses a Random
Forest classifier to distinguish healthy from diseased leaves in
the datasets we created. The proposed work comprises several
implementation phases: dataset creation, feature extraction,
training the classifier, and classification. The datasets of
diseased and healthy leaves are used together to train a
Random Forest that classifies images as diseased or healthy. To
extract the features of an image we use the Histogram of Oriented
Gradients (HOG). Overall, using machine learning to train on the
large publicly available datasets gives a clear way to detect
plant disease on a large scale.
Keywords: Diseased and healthy leaf, random forest, feature
extraction, training, classification.
I. INTRODUCTION
Farmers in rural regions may find it hard to identify the
disease that may be present in their crops. It is not
affordable for them to go to an agricultural office to find
out what the disease may be. Our main objective is to identify
the disease present in a plant by observing its morphology
through image processing and machine learning.
Pests and diseases destroy crops or parts of the plant, which
reduces food production and leads to food insecurity. In
addition, knowledge of pest management and disease control is
limited in many less developed countries. Toxic pathogens,
poor disease control, and drastic climate change are among the
key factors behind dwindling food production.
Various modern technologies have emerged to minimize
postharvest processing, strengthen agricultural sustainability,
and maximize productivity. Laboratory-based approaches such as
polymerase chain reaction, gas chromatography, mass
spectrometry, thermography, and hyperspectral techniques have
been employed for disease identification. However, these
techniques are not cost-effective and are highly
time-consuming.
In recent times, server-based and mobile-based approaches
have been employed for disease identification. Features of
these technologies such as high-resolution cameras,
high-performance processing, and extensive built-in accessories
are added advantages that enable automatic disease
recognition.
Modern approaches such as machine learning and deep
learning algorithms have been employed to increase the
recognition rate and the accuracy of the results. Considerable
research has been carried out on machine learning for plant
disease detection and diagnosis, using approaches such as
random forests, artificial neural networks, support vector
machines (SVM), fuzzy logic, the K-means method, and
convolutional neural networks.
Random forests are an ensemble learning method for
classification, regression, and other tasks that operates by
constructing a forest of decision trees at training time.
Unlike a single decision tree, a random forest reduces
overfitting to its training data set, and it handles both
numeric and categorical data.
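The ensemble idea can be illustrated with a deliberately small sketch. It is not the classifier used in this work: each "tree" here is a depth-1 decision stump, labels are assumed to be 0 (healthy) or 1 (diseased), and the function names, tree count, and square-root feature rule are illustrative assumptions. What it shows is the two ingredients described above: every tree sees a bootstrap sample and a random subset of features, and the forest predicts by majority vote.

```python
import numpy as np

def fit_stump(X, y, feature_ids):
    """Best one-feature threshold rule (a depth-1 tree) for 0/1 labels."""
    majority = int(y.mean() >= 0.5)
    # Fallback if no feature can split the sample: always predict the majority.
    best = (len(y) + 1, feature_ids[0], np.inf, majority, majority)
    for f in feature_ids:
        for t in np.unique(X[:, f]):
            left = X[:, f] <= t
            if left.all():                      # nothing would go right
                continue
            left_label = int(y[left].mean() >= 0.5)
            right_label = int(y[~left].mean() >= 0.5)
            errors = int((np.where(left, left_label, right_label) != y).sum())
            if errors < best[0]:
                best = (errors, f, t, left_label, right_label)
    return best[1:]

def fit_forest(X, y, n_trees=25, seed=0):
    """Train each stump on a bootstrap sample and a random feature subset."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    k = max(1, int(np.sqrt(d)))                       # assumed subset size
    forest = []
    for _ in range(n_trees):
        rows = rng.integers(0, n, size=n)             # sample with replacement
        feats = rng.choice(d, size=k, replace=False)  # features this tree may use
        forest.append(fit_stump(X[rows], y[rows], feats))
    return forest

def predict_forest(forest, X):
    """Majority vote over all trees."""
    votes = np.array([np.where(X[:, f] <= t, l, r) for f, t, l, r in forest])
    return (votes.mean(axis=0) >= 0.5).astype(int)
```

Because every tree is trained on different rows and features, individual trees make different mistakes, and the vote averages them out. A full random forest grows deeper trees and redraws the feature subset at every split.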
The histogram of oriented gradients (HOG) is a feature
descriptor used in computer vision and image processing
for object detection. Here we make use of three feature
descriptors:
1. Hu moments
2. Haralick texture
3. Color Histogram
Hu moments are used to capture the shape of the leaves,
Haralick texture is used to capture the texture of the leaves,
and the color histogram represents the distribution of colors
in an image.
2018 International Conference on Design Innovations for 3Cs Compute Communicate Control
978-1-5386-7523-6/18/$31.00 ©2018 IEEE
DOI 10.1109/ICDI3C.2018.00017
II. LITERATURE REVIEW
[1] S. S. Sannakki and V. S. Rajpurohit proposed
“Classification of Pomegranate Diseases Based on Back
Propagation Neural Network”, which segments the defective area
and uses color and texture as features. A neural network
classifier is used for the classification. The main advantage
is that the image is converted to L*a*b to extract its
chromaticity layers, and the categorization is found to be
97.30% accurate. The main disadvantage is that it applies only
to a limited set of crops.
[2] P. R. Rothe and R. V. Kshirsagar introduced “Cotton
Leaf Disease Identification using Pattern Recognition
Techniques”, which uses snake segmentation with Hu’s
moments as the distinctive attribute. An active contour
model is used to limit the energy inside the infection spot,
and a BPNN classifier handles the multi-class problem. The
average classification accuracy is found to be 85.52%.
[3] Aakanksha Rastogi, Ritika Arora and Shanu Sharma, “Leaf
Disease Detection and Grading using Computer Vision
Technology & Fuzzy Logic”. K-means clustering is used to
segment the defective area, GLCM is used to extract texture
features, and fuzzy logic is used for disease grading. An
artificial neural network (ANN) serves as the classifier, which
mainly helps to check the severity of the diseased leaf.
[4] Godliver Owomugisha, John A. Quinn, Ernest Mwebaze
and James Lwasa proposed “Automated Vision-Based
Diagnosis of Banana Bacterial Wilt Disease and Black
Sigatoka Disease”. Color histograms are extracted and
transformed from RGB to HSV and from RGB to L*a*b. Peak
components are used to create a max tree, five shape attributes
are used, and area under the curve analysis is used for
classification. They used nearest neighbors, decision tree,
random forest, extremely randomized trees, Naïve Bayes and
support vector classifiers. Of the seven classifiers compared,
extremely randomized trees yield a very high score, provide
real-time information, and provide flexibility to the
application.
[5] uan Tian, Chunjiang Zhao, Shenglian Lu and Xinyu Guo,
“SVM-based Multiple Classifier System for Recognition of
Wheat Leaf Diseases”. Color features are converted from RGB
to HSI, GLCM is used, and seven invariant moments are taken as
shape parameters. They used an SVM-based multiple classifier
system (MCS) for detecting disease in wheat plants offline.
III. PROPOSED METHODOLOGY
To determine whether a leaf is diseased or healthy, the
following steps are carried out: preprocessing, feature
extraction, training of the classifier, and classification.
Preprocessing brings all images to a reduced, uniform size.
Features are then extracted from the preprocessed image
with the help of HoG. HoG [6] is a feature descriptor used for
object detection. In this descriptor, the appearance of the
object and the outline of the image are described by its
intensity gradients. One advantage of HoG feature extraction is
that it operates on local cells, so transformations of the
image have little effect on it.
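The per-cell operation of HoG can be sketched as follows. This is a simplified illustration, not the extraction code used in this work: it omits block normalization and bin interpolation, and the cell size of 8 pixels, the 9 orientation bins, and the function name are assumptions chosen because they are common defaults.

```python
import numpy as np

def hog_cell_histograms(gray, cell=8, bins=9):
    """Unnormalized HoG: one gradient-orientation histogram per cell."""
    gray = np.asarray(gray, dtype=float)
    gy, gx = np.gradient(gray)                 # derivatives along rows, columns
    magnitude = np.hypot(gx, gy)
    # Unsigned orientation in [0, 180) degrees.
    angle = np.degrees(np.arctan2(gy, gx)) % 180.0
    # Clamp guards against rounding that lands exactly on 180 degrees.
    bin_idx = np.minimum((angle / (180.0 / bins)).astype(int), bins - 1)

    rows, cols = gray.shape[0] // cell, gray.shape[1] // cell
    hist = np.zeros((rows, cols, bins))
    for r in range(rows):
        for c in range(cols):
            sl = (slice(r * cell, (r + 1) * cell),
                  slice(c * cell, (c + 1) * cell))
            # Each pixel votes for its orientation bin, weighted by magnitude.
            hist[r, c] = np.bincount(bin_idx[sl].ravel(),
                                     weights=magnitude[sl].ravel(),
                                     minlength=bins)
    return hist
```

Flattening the returned array gives one feature vector per image, which is why all images must first be resized to the same dimensions.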
Here we made use of three feature descriptors.
Hu moments: Image moments capture important characteristics
of the image pixels and help in describing objects. Here, Hu
moments describe the outline of a particular leaf. Hu moments
are calculated over a single channel only, so the first step is
to convert the RGB image to grayscale, after which the Hu
moments are calculated. This step gives an array of shape
descriptors.
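For reference, the seven Hu invariants can be computed directly from the normalized central moments of a single-channel image. The sketch below uses NumPy only and is an illustration rather than the code used in this work; the function name is hypothetical, and the input is assumed to be a non-empty grayscale or binary array.

```python
import numpy as np

def hu_moments(gray):
    """Return the seven Hu invariants of a single-channel image."""
    img = np.asarray(gray, dtype=float)
    y, x = np.mgrid[:img.shape[0], :img.shape[1]]
    m00 = img.sum()                              # assumed non-zero
    xc, yc = (x * img).sum() / m00, (y * img).sum() / m00

    def eta(p, q):
        # Central moment, normalized so the result is scale invariant.
        mu = ((x - xc) ** p * (y - yc) ** q * img).sum()
        return mu / m00 ** (1 + (p + q) / 2)

    n20, n02, n11 = eta(2, 0), eta(0, 2), eta(1, 1)
    n30, n03, n21, n12 = eta(3, 0), eta(0, 3), eta(2, 1), eta(1, 2)
    a, b = n30 + n12, n21 + n03
    c, d = n30 - 3 * n12, 3 * n21 - n03
    return np.array([
        n20 + n02,
        (n20 - n02) ** 2 + 4 * n11 ** 2,
        c ** 2 + d ** 2,
        a ** 2 + b ** 2,
        c * a * (a ** 2 - 3 * b ** 2) + d * b * (3 * a ** 2 - b ** 2),
        (n20 - n02) * (a ** 2 - b ** 2) + 4 * n11 * a * b,
        d * a * (a ** 2 - 3 * b ** 2) - c * b * (3 * a ** 2 - b ** 2),
    ])
```

The values do not change when the leaf is shifted or rotated within the frame, which is what makes them usable as shape descriptors for leaves photographed at different positions.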
Haralick Texture: Healthy and diseased leaves usually have
different textures. Here we use the Haralick texture feature
to distinguish between the textures of healthy and diseased
leaves. It is based on an adjacency (co-occurrence) matrix
indexed by positions (I, J). Texture [7] is calculated from the
frequency with which a pixel of value I occupies the position
next to a pixel of value J. To calculate the Haralick texture,
the image must first be converted to grayscale.
Fig.1. RGB to Gray scale conversion of a leaf.
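The adjacency matrix behind the Haralick texture features can be sketched as below. This is a minimal NumPy illustration, not the code used in this work: it assumes an 8-bit grayscale input, quantizes it to 8 gray levels, uses a single non-negative pixel offset (one pixel to the right), and computes only four of the Haralick statistics. The function names and these parameter values are assumptions.

```python
import numpy as np

def glcm(gray, levels=8, offset=(0, 1)):
    """Symmetric, normalized co-occurrence matrix for one pixel offset.

    offset is (rows down, columns right) and must be non-negative.
    """
    g = np.asarray(gray, dtype=float)
    q = (g * levels / 256).astype(int)           # map 0..255 to 0..levels-1
    dr, dc = offset
    a = q[:q.shape[0] - dr, :q.shape[1] - dc]    # reference pixels
    b = q[dr:, dc:]                              # their neighbours
    m = np.zeros((levels, levels))
    np.add.at(m, (a.ravel(), b.ravel()), 1)      # count each (I, J) pair
    m = m + m.T                                  # count both directions
    return m / m.sum()

def haralick_subset(p):
    """Four texture statistics computed from a normalized GLCM."""
    i, j = np.indices(p.shape)
    nz = p[p > 0]
    return {
        "contrast": float(((i - j) ** 2 * p).sum()),
        "asm": float((p ** 2).sum()),            # angular second moment
        "homogeneity": float((p / (1 + (i - j) ** 2)).sum()),
        "entropy": float(-(nz * np.log2(nz)).sum()),
    }
```

A uniform patch gives zero contrast and an angular second moment of 1, while a patch whose neighbouring pixels always differ gives high contrast, which is the kind of difference expected between smooth healthy tissue and spotted diseased tissue.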
Color Histogram: The color histogram represents the colors in
the image. The RGB image is first converted to the HSV color
space, and the histogram is calculated on the result. The
conversion is needed because the HSV model aligns closely with
how the human eye discerns the colors in an image. The
histogram plot [8] describes the number of pixels that fall in
each of the given color ranges.
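A minimal sketch of this step is given below, assuming an 8-bit RGB image stored as a height x width x 3 array. The 8 bins per channel, the scaling of H, S and V to the range 0 to 1, and the function names are illustrative assumptions, not values taken from this work.

```python
import numpy as np

def rgb_to_hsv(rgb):
    """Convert an 8-bit RGB image to HSV with every channel in [0, 1]."""
    rgb = np.asarray(rgb, dtype=float) / 255.0
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    v = rgb.max(axis=-1)
    c = v - rgb.min(axis=-1)                     # chroma
    s = np.where(v > 0, c / np.where(v > 0, v, 1.0), 0.0)
    safe_c = np.where(c > 0, c, 1.0)             # avoid division by zero
    h = np.select(
        [c == 0, v == r, v == g, v == b],
        [np.zeros_like(v),
         ((g - b) / safe_c) % 6,
         (b - r) / safe_c + 2,
         (r - g) / safe_c + 4],
        default=0.0) / 6.0
    return np.stack([h, s, v], axis=-1)

def hsv_histogram(rgb, bins=(8, 8, 8)):
    """Normalized 3-D HSV histogram, flattened into one feature vector."""
    pixels = rgb_to_hsv(rgb).reshape(-1, 3)
    hist, _ = np.histogramdd(pixels, bins=bins,
                             range=((0, 1), (0, 1), (0, 1)))
    return hist.ravel() / hist.sum()
```

Normalizing by the pixel count makes histograms of differently sized images comparable, and the flattened vector can be concatenated with the shape and texture descriptors.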
Fig.2. RGB to HSV conversion of leaf
Fig.3. Histogram plot for healthy and diseased leaf.
IV. ALGORITHM DESCRIPTION
The algorithm is implemented using a random forest
classifier. Random forests are flexible and can be used for
both classification and regression. Compared with other
machine learning techniques such as SVM, Gaussian Naïve Bayes,
logistic regression, and linear discriminant analysis, random
forests gave higher accuracy with a smaller image data
set. The following figure shows the architecture of our
proposed algorithm.
Fig.4. Architecture of the proposed model
Fig.5. Flow chart for training.
Fig.6. Flow chart for classification
The labeled datasets are divided into training and testing
data. The feature vector for the training dataset is generated
using HoG feature extraction, and a Random Forest classifier is
trained on it. The feature vector for the testing data, also
generated through HoG feature extraction, is then given to the
trained classifier for prediction, as shown in Fig.4.
As shown in Fig.5, the labeled training datasets are
converted into their respective feature vectors by HoG feature
extraction. These extracted feature vectors are saved with the
training dataset. The Random Forest classifier is then trained
on these feature vectors [9, 10].
As depicted in Fig.6, the feature vectors are extracted for
the test image using HoG feature extraction. These feature
vectors are given to the saved, trained classifier to predict
the result.
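The split, train, save, and predict flow can be sketched as follows. To keep the example short and dependency-free, a nearest-centroid rule stands in for the Random Forest, so this shows only the workflow, not the classifier used in this work. The 25 percent test fraction, the JSON serialization, and all function names are assumptions.

```python
import json
import numpy as np

def split(X, y, test_fraction=0.25, seed=0):
    """Shuffle once, then hold out a fraction of the labeled samples."""
    order = np.random.default_rng(seed).permutation(len(y))
    n_test = int(round(test_fraction * len(y)))
    test, train = order[:n_test], order[n_test:]
    return X[train], y[train], X[test], y[test]

def train(X, y):
    """Stand-in model (NOT a random forest): mean feature vector per class."""
    return {str(c): X[y == c].mean(axis=0).tolist() for c in np.unique(y)}

def save(model):
    """Serialize the trained model so it can be reused at prediction time."""
    return json.dumps(model)

def load(text):
    return json.loads(text)

def predict(model, X):
    """Assign each feature vector to the class with the closest mean."""
    labels = list(model)
    centroids = np.array([model[k] for k in labels])
    dist = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return np.array([int(labels[i]) for i in dist.argmin(axis=1)])

def accuracy(y_true, y_pred):
    return float((y_true == y_pred).mean())
```

In use, the feature vectors of the labeled images go through `split`, the training part through `train` and `save`, and the held-out part through `load` and `predict`, with `accuracy` reporting the share of test leaves labeled correctly.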
V. RESULT
First, every RGB image is converted into a grayscale
image, because the Hu moments shape descriptor and the Haralick
features can be calculated over a single channel only.
It is therefore necessary to convert RGB to grayscale before
computing the Hu moments and Haralick features, as depicted in
Fig.1.
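The conversion itself is a weighted sum of the three channels. The sketch below assumes the ITU-R BT.601 luma weights, which are a widely used convention for RGB-to-gray conversion; the weights actually used in this work are not stated, and the function name is hypothetical.

```python
import numpy as np

def rgb_to_gray(rgb):
    """Collapse an H x W x 3 RGB image to one channel with BT.601 weights."""
    weights = np.array([0.299, 0.587, 0.114])    # assumed R, G, B weights
    return np.asarray(rgb, dtype=float) @ weights
```

Green carries the largest weight, so differences in the green channel of a leaf image dominate the resulting gray values.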
To calculate the histogram, the image must first be converted
to HSV (hue, saturation and value), so the RGB image is
converted to an HSV image as shown in Fig.2.
Finally, the main aim of our project is to detect whether a
leaf is diseased or healthy with the help of a Random Forest
classifier, as depicted in Fig.7.
Fig.7. Final output of the classifier.
Fig.8. Comparison between different machine learning models.
TABLE I.
Fig.9. Table showing the comparison.
VI. CONCLUSION
The objective of this algorithm is to recognize abnormalities
that occur on plants in greenhouses or in their natural
environment. The image is usually captured against a plain
background to eliminate occlusion. The algorithm was
compared with other machine learning models for accuracy.
Using the Random Forest classifier, the model was trained on
160 images of papaya leaves and could classify with
approximately 70 percent accuracy. The accuracy can be
increased by training with a much larger number of images and
by using local features together with the global features,
such as SIFT (Scale Invariant Feature Transform), SURF
(Speeded-Up Robust Features) and DENSE along with BOVW
(Bag of Visual Words).
The graph (Fig.8) and the table (Table I) give the comparison
of the machine learning algorithms.
REFERENCES
[1] S. S. Sannakki and V. S. Rajpurohit, “Classification of Pomegranate
Diseases Based on Back Propagation Neural Network,” International
Research Journal of Engineering and Technology (IRJET), vol. 2,
issue 02, May 2015.
[2] P. R. Rothe and R. V. Kshirsagar, “Cotton Leaf Disease Identification
using Pattern Recognition Techniques,” International Conference on
Pervasive Computing (ICPC), 2015.
[3] Aakanksha Rastogi, Ritika Arora and Shanu Sharma, “Leaf Disease
Detection and Grading using Computer Vision Technology & Fuzzy
Logic,” 2nd International Conference on Signal Processing and
Integrated Networks (SPIN), 2015.
[4] Godliver Owomugisha, John A. Quinn, Ernest Mwebaze and James
Lwasa, “Automated Vision-Based Diagnosis of Banana Bacterial Wilt
Disease and Black Sigatoka Disease,” Proceedings of the 1st
International Conference on the Use of Mobile ICT in Africa, 2014.
[5] uan Tian, Chunjiang Zhao, Shenglian Lu and Xinyu Guo, “SVM-based
Multiple Classifier System for Recognition of Wheat Leaf Diseases,”
Proceedings of 2010 Conference on Dependable Computing
(CDC’2010), November 20-22, 2010.
[6] S. Yun, W. Xianfeng, Z. Shanwen, and Z. Chuanlei, “PNN based crop
disease recognition with leaf image features and meteorological data,”
International Journal of Agricultural and Biological Engineering, vol. 8,
no. 4, p. 60, 2015.
[7] J. G. A. Barbedo, “Digital image processing techniques for detecting,
quantifying and classifying plant diseases,” SpringerPlus, vol. 2,
no. 660, pp. 1-12, 2013.
[8] A. Caglayan, O. Guclu, and A. B. Can, “A plant recognition approach
using shape and color features in leaf images,” in International
Conference on Image Analysis and Processing, Springer, Berlin,
Heidelberg, pp. 161-170, September 2013.
[9] X. Zhen, Z. Wang, A. Islam, I. Chan, and S. Li, “Direct estimation
of cardiac bi-ventricular volumes with regression forests,” in Medical
Image Computing and Computer-Assisted Intervention (MICCAI), 2014.
[10] P. Wang, K. Chen, L. Yao, B. Hu, X. Wu, J. Zhang, et al.,
“Multimodal classification of mild cognitive impairment based on
partial least squares,” 2016.
Machine learning model      Accuracy (percent)
Logistic regression         65.33
Support vector machine      40.33
k-nearest neighbor          66.76
CART                        64.66
Random Forests              70.14
Naïve Bayes                 57.61