
Automated Vision-Based Diagnosis of Banana Bacterial Wilt Disease and Black Sigatoka Disease

Godliver Owomugisha, John A. Quinn, Ernest Mwebaze
Department of Computer Science
Makerere University
Email: [owomugisha.godliver, jquinn, emwebaze]@cis.mak.ac.ug
James Lwasa
Department of Information Systems
Makerere University
Email: lwasaj@yahoo.com
Abstract—Machine learning has been applied in agriculture
in various areas, including crop disease detection, and image
processing systems have been developed for some crops. These
crops include cotton, pomegranate, grapes, vegetables,
tomatoes, potatoes and cassava, among others. However, no
machine learning techniques have been used in an attempt to
detect diseases in the banana plant such as banana bacterial
wilt (BBW) and banana black sigatoka (BBS), which have caused
huge losses to many banana growers. This study investigated various
computer vision techniques, which led to the development of an
algorithm that consists of four main phases. In phase one, images
of banana leaves were acquired using a standard digital camera.
Phase two involves the use of different feature extraction techniques
to obtain relevant data for phase three, where images are
classified as either healthy or diseased. Of the seven classifiers that
were used in this study, Extremely Randomized Trees performed
best in identifying the diseases, achieving 0.96 AUC for BBW and
0.91 for BBS. Lastly, the performance of these classifiers was
evaluated based on area under the curve (AUC) analysis and
the best method to automatically diagnose these banana diseases was
then recommended.
I. INTRODUCTION
Banana is the fourth most grown crop in the world after
wheat, rice and maize, and Uganda is the second
largest producer of bananas after India [1]. The crop is
a staple food source in the country; however, its growth is
threatened by banana bacterial wilt (BBW) disease, caused
by Xanthomonas campestris pv. musacearum (XCM) [2]. The
wilt originated in Ethiopia, and in Uganda it was reported
by Tushemereirwe et al. [3] in Kayunga district in 2001.
The disease has also spread into Congo, Rwanda and
Tanzania. BBW affects all types of banana and spreads
very fast with devastating effect; hence, many farmers
have lost their crops, and this has reduced food
availability and income for banana farmers. The disease
also brings many costs, including labor for cutting
down and disposing of infected plants, de-budding the male
flowers and disinfecting cutting tools.
Following the outbreak of the BBW epidemic, the government of
the Republic of Uganda, through the Ministry of Agriculture,
Animal Industry and Fisheries (MAAIF) in conjunction with the
National Agricultural Research Organization and other key
stakeholders, constituted a national task force in December
2001, which in November 2003 formulated a long-term strategy
and action plan to eradicate the disease. This strategy includes
a nationally coordinated effort of continuous monitoring of
the epidemic, awareness raising and training campaigns, and
empowering all stakeholders at all levels (district, sub-county,
parish and village) to control the disease [4].
The problem of identifying diseases in plants is
well known. Farmers often realize that their crops are diseased
only when the disease has reached a late stage and the symptoms
are clearly visible. However, not much can be
done to control the situation by that stage; hence this study
aimed at early disease detection. The symptoms are visible
in the leaves, male bud, fruit and stem. The disease can begin
with any leaf, causing leaves to turn yellow, then brown, and
later to wilt. Young affected plants become stunted and
may not produce any fruit. Apart from BBW,
Tushemereirwe et al. [3] mention other diseases that have led
to the decline in banana production in many banana growing
countries, including Uganda; these include
banana streak virus disease and banana black sigatoka (BBS).
BBS blackens parts of the leaf; drying normally starts
from the edges and eventually the entire leaf is killed.
This paper is divided into five sections, starting with
this introduction. Section two presents research that has been
done on crop disease detection. Section three describes how
different feature extraction and classification methods were
applied to achieve the objectives of the study. Results of the
techniques used are evaluated in section four, and the last
section recommends the methods that worked best and outlines
future work.
II. RELATED WORK IN COMPUTER VISION FOR
AGRICULTURAL DISEASE DETECTION
Computer vision systems have been used increasingly
in the food and agricultural areas for quality inspection [5],
[6] and evaluation purposes as they provide suitably rapid,
economic, consistent and objective assessment. They have
proved to be successful for the objective measurement and
assessment of several agricultural products [7]. With the
advantages of superior speed and accuracy, a significant
number of researchers have been attracted to apply machine
vision techniques in crop disease detection.
A support vector machine technique has been used for
classification and identification of foliar diseases in cotton [8].
The classification process starts by finding the best feature
vector for each class and then creates the final classification
system from the best results obtained. To accomplish this,
the following steps were considered: decomposition of images into
multiple channels (R, G, B, H, S, V, I3a, I3b, and GL),
application of the discrete wavelet transform up to the third
level, and computation of the energy for each sub-band and of the
feature vectors. This is followed by creation of the SVM
classification environment, listing of the images used for
training and testing, and evaluation of the best feature vectors.
Al-Hiary et al. [9] proposed a method for automatic detection and
classification of leaf diseases, with the work divided
into three parts: identification of the
infected object(s) based on a K-means clustering procedure,
extraction of the feature set of the infected objects using the color
co-occurrence methodology for texture analysis, and finally
detection and classification of the disease type using artificial
neural networks (ANNs).
Aduwo et al. [10] present an automated vision-based
diagnosis of cassava mosaic disease. The proposed algorithm
is based on camera-phone input to provide a more efficient
solution. The methodology begins with capturing leaf images
with a standard digital camera. The captured image is then
processed by applying various image processing techniques
such as SIFT, SURF and HSV for shape feature extraction.
The image is then classified as diseased or not using
methods such as a k-nearest neighbor classifier (KNN),
support vector classifier (SVC) and Naive Bayes, among
others. A comparison of the different classifiers was done
and results for the three main datasets were produced.
Others [7], [11] have demonstrated the value of image
processing in inspecting and grading the quality of agricultural
and food products. An automated system for the disease
detection and grading in pomegranate plant was proposed in
[11]. The techniques used here include color segmentation
based on linear discriminant analysis, contour curvature
analysis and a thinning process, which involves iterating until
the stem becomes a skeleton.
The approach in [9] uses the color co-occurrence methodology
for texture analysis, which makes it not applicable to banana
leaves. However, the algorithm developed in this study combines the
features extracted in [8] and [10], and this strengthened the
results. In addition to the classifiers used in those works, the study
investigated the behavior of other classification techniques
on the dataset and recommended the best methods. This has
not been done in the past for banana diseases, which makes
this research new.
III. METHODS AND RESULTS
The methodology aimed at detecting the BBW and BBS
diseases using automated vision-based diagnosis techniques,
and the work was divided into four parts: image acquisition, feature
extraction, disease classification and evaluation of the
classification performance.
A. Image acquisition
A 12-megapixel Canon digital camera was used to
capture images of both healthy and diseased leaves from different
banana plantations in Bushenyi district (western Uganda),
where these diseases are common. Samples were taken from 5
sub-counties at an average of 5 diseased plants per plantation.
A total of 623 image samples were used for this study and the data
were organized in three sets. Set one holds 360 leaves from
healthy plants, set two has 220 leaves diseased with BBW and
set three has 43 leaves diseased with BBS. In order to capture
clear images with descriptive details, the camera was set to
a horizontal and vertical resolution of 72 dpi (dots per inch).
The flash was off since images were taken during
daytime with enough natural light, and the process did not involve
cutting or removing leaves from the plant. One sample image
from each set is given in the figures below.
Fig. 1. Healthy leaf
Fig. 2. Leaf affected by BBW disease
Fig. 3. Leaf affected by BBS disease
B. Feature extraction and creation of feature vectors
Captured images often contain many
objects, especially in the background, and working with such
images leads to inaccurate results. The images were therefore
cropped in order to obtain the leaf part only. However, the cropped
images then had a white background with pixel values of 255,
and working with the whole image would still give inappropriate
results. To avoid this, a mask was applied to
the image in order to obtain the useful segment. The region
with the most green pixels was identified and, based on a gray-level
threshold of < 200, pixels below the threshold (the green leaf region)
are set to one and the background is set to zero. This converts
the image into a binary image that segments the
leaf from the background. The mask was then applied to
the original image during histogram calculation as follows:
pixels where the mask is zero were discarded (by multiplying
the mask pixel values with the pixels of the original image)
and only the region where the mask is one was considered
during histogram calculation. Color histograms were extracted
after transforming the images from RGB to HSV and from RGB to L*a*b*.
Fig. 4 is the mask of Fig. 1.
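The masking and histogram steps described above can be sketched as follows. This is a minimal illustration and not the authors' released code: it uses only the Python standard library instead of OpenCV, represents an image as a flat list of (R, G, B) tuples, and assumes a simple channel average as the grayscale conversion. The function names `leaf_mask` and `masked_hue_histogram` are hypothetical.

```python
import colorsys


def leaf_mask(pixels, gray_threshold=200):
    """Return 1 for pixels darker than the threshold (leaf), 0 for background."""
    mask = []
    for r, g, b in pixels:
        gray = (r + g + b) / 3  # assumed grayscale conversion (plain average)
        mask.append(1 if gray < gray_threshold else 0)
    return mask


def masked_hue_histogram(pixels, mask, bins=50):
    """Normalized hue histogram computed over the masked (leaf) pixels only."""
    counts = [0] * bins
    for (r, g, b), keep in zip(pixels, mask):
        if not keep:
            continue  # background pixels do not contribute
        h, _s, _v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
        index = min(int(h * bins), bins - 1)  # hue is in [0, 1]
        counts[index] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * bins


# Two white background pixels and two green leaf pixels.
pixels = [(255, 255, 255), (30, 160, 40), (40, 170, 50), (250, 250, 250)]
mask = leaf_mask(pixels)
hist = masked_hue_histogram(pixels, mask)
```

Skipping masked-out pixels has the same effect on the histogram as multiplying the image by the mask and then ignoring the zeroed region, which is why the white background does not appear in the resulting bins.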
Fig. 4. Mask
Shape was also considered in this study. The
process of calculating shape features was based on three
routines: thresholding at different levels, extracting
connected components, and calculating morphological features for
each connected component. First, each image is thresholded at
different gray levels. Connectivity openings [12] were used to calculate
all the components in each thresholded image. These are
called the peak components and were used to construct a
max-tree, a data structure designed for morphological
image processing that makes it efficient to compute features or
attributes of the connected components (following the same
methodology as [13]). This process was done for every image
and various morphological features were calculated for the
connected components. Five shape attributes were
chosen as the most important: area of the
minimum enclosing rectangle, elongation, small compactness,
small perimeter and moment of inertia.
The minimum bounding rectangle, also called the minimum
bounding box, is the smallest rectangle that contains every
point in the shape. For an arbitrary shape, eccentricity is the
ratio of the length L and width W of the minimal bounding
rectangle of the shape at some set of orientations. Elongation,
Elo, is based on eccentricity [14]:
$$Elo = 1 - \frac{W}{L}$$
Fig. 5. Minimum bounding rectangle and corresponding parameters for
elongation
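As a small worked illustration of the elongation measure, the sketch below computes Elo = 1 - W/L from a set of boundary points. It is a simplification made for this example: it uses the axis-aligned bounding box, whereas the minimum bounding rectangle in the text may be rotated. The function name `elongation` is hypothetical.

```python
def elongation(points):
    """Elongation 1 - W/L of the axis-aligned bounding box of (x, y) points.

    Assumption: the box is axis-aligned; a rotated minimum bounding
    rectangle could give a different (never smaller in area) result.
    """
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    side_a = max(xs) - min(xs)
    side_b = max(ys) - min(ys)
    length, width = max(side_a, side_b), min(side_a, side_b)
    if length == 0:
        return 0.0  # degenerate shape (a single point)
    return 1.0 - width / length


rectangle = [(0, 0), (4, 0), (4, 2), (0, 2)]
square = [(0, 0), (3, 0), (3, 3), (0, 3)]
elo_rectangle = elongation(rectangle)
elo_square = elongation(square)
```

A square gives an elongation of 0, and the value approaches 1 as the shape becomes long and thin.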
The compactness measure of a shape is a numerical
quantity representing the degree to which the shape is compact;
one such measure is the ratio of surface area
to volume. The perimeter of an object is the distance around
the outside of the object. Unlike regular shapes, where at
least two sides or angles are the same, irregular objects do
not have these instances of symmetry, and the perimeter is
determined by taking each edge of the shape into consideration,
traversing from the left or right,
bottom or top. Moment of inertia is area (mass) times the
square of the perpendicular distance to the rotation axis, $I = A d^2$.
To create feature vectors, histogram data for the color components
H for HSV, R for RGB and L* for L*a*b* was extracted.
These components were also combined, for example HS,
HV or SV, and some classifiers yielded better results with the
combinations. Another
comparison was done where classification was based on the
extracted shape features combined with the color histogram
features. To avoid dealing with huge data and overfitting, only
50 bins were used in each case, and the histograms were
normalized as well.
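One way to picture the combined feature vector is as two normalized 50-bin histograms followed by the shape attributes. The sketch below is an assumption about the layout, not the authors' implementation: the paper does not state how the per-component shape attributes are aggregated or scaled, so they are passed in here as five ready-made numbers. The names `normalized_histogram` and `build_feature_vector` are hypothetical.

```python
def normalized_histogram(values, bins=50, lo=0.0, hi=1.0):
    """Histogram of values in [lo, hi] with counts scaled to sum to 1."""
    counts = [0] * bins
    for v in values:
        index = int((v - lo) / (hi - lo) * bins)
        counts[min(max(index, 0), bins - 1)] += 1  # clamp edge values
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * bins


def build_feature_vector(hue_values, sat_values, shape_attributes):
    """Concatenate H and S histograms with the shape attributes."""
    return (
        normalized_histogram(hue_values)
        + normalized_histogram(sat_values)
        + list(shape_attributes)
    )


# Illustrative inputs only: a few hue/saturation samples in [0, 1]
# and five placeholder shape attribute values.
vector = build_feature_vector(
    hue_values=[0.30, 0.33, 0.35, 0.90],
    sat_values=[0.50, 0.55, 1.00],
    shape_attributes=[1.0, 2.0, 3.0, 4.0, 5.0],
)
```

With 50 bins per color component and five shape attributes, the HS-plus-shape vector has 105 entries under this layout.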
C. Disease classification
Classifiers map an unlabeled instance, here a color histogram
feature vector (or a combination of color histogram and
shape feature vectors), to a label. The seven
classifiers used in this study were: Nearest Neighbors [15],
Decision Tree [16], [17], Random Forest [18], [19], Extremely
Randomized Trees [20], Naive Bayes [21] and two support vector
classifiers (Linear SVM and RBF SVM) [22], [23], [24],
[25]. The data set was split into training
and testing sets using k-fold cross-validation, sometimes called
rotation estimation: the dataset was randomly split
into 10 mutually exclusive subsets (folds) of approximately equal size [26].
The implementation platform was Python with the OpenCV and
scikit-learn libraries. Data and source code used in
this work are available at https://github.com/godliver/source-code-BBW-BBS.git.
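The fold construction can be illustrated with a short standard-library sketch. It is not the library routine used in the study; it only shows how a random split into 10 mutually exclusive folds yields one train/test pair per fold. The function name `k_fold_indices` and the fixed seed are assumptions made for this example.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train, test) index lists for k mutually exclusive folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # random assignment to folds
    folds = [indices[i::k] for i in range(k)]  # sizes differ by at most one
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# 623 is the number of image samples reported in the paper.
splits = list(k_fold_indices(623, k=10))
fold_sizes = sorted(len(test) for _, test in splits)
```

Because 623 is not divisible by 10, the folds cannot be exactly equal; in this sketch three folds hold 63 samples and seven hold 62.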
IV. RESULTS
The choice of which algorithm (classifier) performed
best was based on the results of the AUC analysis, which
compares the true positive rate and false positive rate of
the different classifiers. If a classifier yields an
AUC score of 1.0, it has predicted perfectly; 0.5 corresponds to
random performance, and a score below 0.5 means the classifier is
anti-correlated with the target. Different tests were made for
various color components with shape features, and the best
performance was obtained when the color components H and
S of HSV were combined with the five selected shape attributes.
The AUC results for the different classes (BBW,
BBS and healthy) are shown in Figs. 6, 7 and 8 respectively.
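To make the AUC interpretation concrete, the sketch below computes AUC through its rank-based equivalent: the fraction of (positive, negative) pairs in which the positive sample receives the higher score, with ties counted as half. This is an illustrative standard-library version, not the evaluation code used in the study, and the name `auc_score` is hypothetical.

```python
def auc_score(labels, scores):
    """AUC as the probability that a positive outranks a negative."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(positives) * len(negatives))


labels = [0, 0, 1, 1]
partial = auc_score(labels, [0.1, 0.4, 0.35, 0.8])   # 3 of 4 pairs ordered correctly
perfect = auc_score(labels, [0.1, 0.2, 0.8, 0.9])    # every positive outranks every negative
inverted = auc_score(labels, [0.9, 0.8, 0.2, 0.1])   # scores anti-correlated with labels
```

The three calls correspond to the cases in the text: an imperfect but useful classifier, a perfect one (1.0) and one that is anti-correlated with the target (below 0.5).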
Fig. 6. HS-color components with shape attributes (AUC for BBW)
Fig. 7. HS-color components with shape attributes (AUC for BBS)
Of the seven classifiers, Extremely Randomized Trees
yielded the highest score. Both Random Forest and Extremely
Randomized Trees are ensemble methods, and both
are perturb-and-combine techniques specifically
designed for trees. This means a diverse set of classifiers
is created by introducing randomness in the classifier
construction, and the prediction of the ensemble is given as the
averaged prediction of the individual classifiers [27]. The
scikit-learn implementation combines classifiers by averaging their
probabilistic predictions, instead of letting each classifier vote
for a single class [18]. However, with Extremely Randomized
Trees, randomness goes one step further in the way splits
are computed. As in random forests, a random subset of
candidate features is used, but instead of looking for the
most discriminative thresholds, thresholds are drawn at random
for each candidate feature and the best of these randomly
generated thresholds is picked as the splitting rule. This usually
reduces the variance of the model a bit more, at the
expense of a slightly greater increase in bias [20]. Table I
shows the results of the Extremely Randomized Trees classifier
for the two kinds of leaf features. Although color has a greater
impact than shape, AUC performance is better when
both feature types are combined.
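The random-threshold splitting rule can be sketched for a single tree node as follows. This is an illustration of the idea only, not the scikit-learn implementation: it assumes binary 0/1 labels, uses weighted Gini impurity as the split score, and the names `gini` and `random_threshold_split` are hypothetical.

```python
import random


def gini(labels):
    """Gini impurity of a list of binary (0/1) labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 2.0 * p * (1.0 - p)


def random_threshold_split(rows, labels, n_candidate_features, rng):
    """Pick the best of one random threshold per randomly chosen feature.

    Returns (weighted_impurity, feature_index, threshold), or None when
    every candidate feature is constant.
    """
    features = rng.sample(range(len(rows[0])), n_candidate_features)
    best = None
    for f in features:
        column = [row[f] for row in rows]
        lo, hi = min(column), max(column)
        if lo == hi:
            continue  # a constant feature cannot split the node
        threshold = rng.uniform(lo, hi)  # drawn at random, not optimized
        left = [y for x, y in zip(column, labels) if x < threshold]
        right = [y for x, y in zip(column, labels) if x >= threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if best is None or score < best[0]:
            best = (score, f, threshold)
    return best


# Feature 0 separates the classes; feature 1 is constant.
rows = [[0.0, 5.0], [0.1, 5.0], [0.9, 5.0], [1.0, 5.0]]
labels = [0, 0, 1, 1]
best = random_threshold_split(rows, labels, n_candidate_features=2,
                              rng=random.Random(0))
```

A random forest node would instead search each candidate feature for its most discriminative threshold; here only one random draw per feature is scored, which is the source of the extra randomness described above.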
V. CONCLUSION
With AUC scores of 0.96, 0.91 and 0.99
for the BBW, BBS and healthy classes respectively, this research
has shown that there is a consistent and more accurate way
to automatically detect these banana diseases than relying on
the previous strategies described in [4]. It has
been shown how different feature extraction methods and
classification techniques are applied systematically in the
attempt to solve this problem. The results indicate that the algorithm
is feasible and can identify the two diseases well. The features
that work best for this application are the
H and S color components combined with the five shape
features that were chosen as most important. Among the seven
classifiers that were used, Extremely Randomized Trees is
recommended because of its high performance on this data set.
Fig. 8. HS-color components with shape attributes (AUC for healthy)
                 BBW    BBS    Healthy
Color            0.94   0.90   0.97
Shape            0.90   0.84   0.96
Color + Shape    0.96   0.91   0.99
TABLE I. AUC FOR EXTREMELY RANDOMIZED TREES (EXTRA TREES) CLASSIFIER
The platform for automated vision-based diagnosis
of BBW and BBS diseases provides a useful direction, and
this work can be extended to run on a mobile
phone. This would add flexibility to the application, since
farmers are able to carry their phones to the fields,
and would minimize the cost of training personnel to monitor
banana plants in different regions. The tool could then provide
real-time information, as farmers would not need to wait for experts;
they could send images to a server and then get
advice. Results would also be consistent since
everyone uses the same tool: two experts might give two
different judgements on the same image, but the software will
always give the same answer. Other improvements that can
be made to the current work include:
- Investigating the possibility of bananas being
infected by both BBW and BBS diseases.
- Adding another class to cater for healthy but mature
leaves that are beginning to age, or leaves affected by
drought stress.
- Considering features of other parts of the plant,
such as the stem.
ACKNOWLEDGMENT
The authors gratefully acknowledge the AI-DEV group,
Department of Computer Science, Makerere University, for
helpful suggestions on improving this work. The authors would
also like to thank the team at the National Agricultural
Research Laboratories for being kind and helpful during the
data collection stage.
REFERENCES
[1] “The biology of bananas and plantains,” Uganda National Council
for Science and Technology (UNCST) and Program for Biosafety
Systems (PBS), 2007.
[2] L. Turyagyenda, G. Blomme, F. Ssekiwoko, E. Karamura, S. Mpiira,
and S. Eden-Green, “Rehabilitation of banana farms destroyed by
Xanthomonas wilt in Uganda,” Journal of Applied Biosciences, vol. 8,
no. 1, pp. 230–235, 2008.
[3] W. Tushemereirwe, A. Kangire, J. Kubiriba, M. Nakyanzi, and C. Gold,
“Diseases threatening banana biodiversity in Uganda,” African Crop
Science Journal, vol. 12, no. 1, pp. 19–26, 2004.
[4] O. O. W.K Tushemereirwe, D Ngambeki, “Awareness of banana
bacterial wilt control in Uganda: 2. community leaders’ perspectives,” African
Crop Science Journal, vol. 14, no. 2, p. 166, 2006.
[5] R. H. Asankhani and H. Navid, “Qualitative sorting of potatoes by color
analysis in machine vision system,” Journal of Agricultural Science,
vol. 4, no. 4, 2012.
[6] T. Brosnan and D.-W. Sun, “Improving quality inspection of food
products by computer vision: a review,” Journal of Food Engineering,
2003.
[7] N. V. G and H. K. S, “Quality inspection and grading of agricultural and
food products by computer vision,” International Journal of Computer
Applications, vol. 2, no. 1, May 2010.
[8] A. A. Bernardes, J. G. Rogeri, N. Marranghello, and A. S. Pereira,
“Identification of foliar diseases in cotton crop,” Topics in Medical
Image Processing and Computational Vision, vol. 8, pp. 67–85, 2013.
[9] H. Al-Hiary, S. Bani-Ahmad, M. Reyalat, M. Braik, and Z.
Al-Rahamneh, “Fast and accurate detection and classification of plant
diseases,” International Journal of Computer Applications, vol. 17,
no. 1, pp. 31–38, March 2011.
[10] J. R. Aduwo, E. Mwebaze, and J. A. Quinn, “Automated vision-based
diagnosis of cassava mosaic disease,” Proceedings of ICDM Workshop
on Data Mining in Agriculture, 2010.
[11] S. Sanvakki, V. Rajpurohit, V. Nargund, R. Arunkunar, and P. Yallur, “A
hybrid intelligent system for automated pomegranate disease detection
and grading,” International Journal of Machine Intelligence, vol. 3,
no. 2, pp. 36–44, 2011.
[12] C. Ronse, “Set-theoretical algebraic approaches to connectivity in
continuous or digital spaces,” Journal of Mathematical Imaging and
Vision, 1998.
[13] J. A. Quinn, A. Andama, I. Munabi, and F. N. Kiwanuka, “Automated
blood smear analysis for mobile malaria diagnosis,” Mobile
Point-of-Care Monitors and Diagnostic Device Design, CRC Press, 2014.
[14] Y. Mingqiang, K. Kidiyo, and R. Joseph, “A survey of shape feature
extraction techniques,” Pattern Recognition, pp. 43–90, 2008.
[15] Z. Ma and K. Ata, “K-nearest-neighbours with a novel similarity
measure for intrusion detection,” UKCI’13, pp. 266–271, 2013.
[16] O. Maimon and L. Rokach, “Data mining and knowledge discovery
handbook, second edition,” April 2010.
[17] L. Ruey-Hsia and B. Geneva G, “Instability of decision tree
classification algorithms,” Proceedings of the eighth ACM SIGKDD international
conference on Knowledge discovery and data mining, pp. 570–575,
2002.
[18] B. S. Leo and L. Breiman, “Random forests,” Machine Learning, pp.
5–32, 2001.
[19] L. Andy and W. Matthew, “Classification and regression by
randomForest,” R News, vol. 2, no. 3, pp. 18–22, 2002.
[20] P. Geurts, D. Ernst, and L. Wehenkel, “Extremely randomized trees,”
Mach. Learn., vol. 63, no. 1, pp. 3–42, 2006.
[21] H. Zhang, “The optimality of naive bayes,” Proceedings of the
Seventeenth International Florida Artificial Intelligence Research Society
Conference (FLAIRS 2004), 2004.
[22] S. R. Gunn, “Support vector machines for classification and regression,”
University of Southampton, Technical Report, 1998.
[23] S. Keerthi, O. Chapelle, and D. DeCoste, “Building support vector
machines with reduced classifier complexity,” Journal of Machine
Learning Research, vol. 7, pp. 1493–1515, 2006.
[24] C. wei Hsu, C. chung Chang, and C. jen Lin, “A practical guide to
support vector classification,” National Taiwan University, Taipei 106,
Taiwan, 2010.
[25] N. Cristianini and J. Shawe-Taylor, “An introduction to support
vector machines: And other kernel-based learning methods,” Cambridge
University Press, 2000.
[26] R. Kohavi, “A study of cross-validation and bootstrap for
accuracy estimation and model selection,” Proceedings of the
14th international joint conference on Artificial intelligence
- Volume 2, pp. 1137–1143, 1995. [Online]. Available:
http://dl.acm.org/citation.cfm?id=1643031.1643047
[27] Breiman, “Arcing classifiers,” Annals of Statistics, 1998.
1st
... For instance: a . Random Forest (RF) used by Owomugisha et al. (2014), , , and Gómez-Selvaraj et al. (2020). ...
... Classification trees (CART) applied by Manjunath et al. (2019) d . Decision tree algorithm (C5.0) applied by Owomugisha et al. (2014) e . Linear discriminant analysis (LDA) developed by Companioni et al. (2005) f . ...
... Owomugisha et al. 2014, Ma et al. 2017Gomez-Selvaraj et al. 2020), artificial neural networks, support vector machine (SVM)(Vipinadas and Thamizharasi, 2016;), and decision trees(Owomugisha et al. 2014), among others. ...
Thesis
Full-text available
Banana, the edible fruit of Musaceae, is a staple food for more than 400 million people worldwide due to their nutritional and energy attributes. This makes Musaceae a crop of worldwide relevance, particularly in tropical regions, highlighting the impact of improved Musaceae cropping systems in the current efforts worldwide oriented towards a new agricultural revolution based on sustainable intensification. To achieve this, better practices for food production based on scientific and technical research capable to consider the complexity and variability within the agri-food sector are necessary. The research presented in this PhD Thesis is oriented towards providing answers to the causes of two aspects considered of high relevance for banana production, both affecting productivity and sustainability, always addressed for the Venezuelan conditions, one of the world’s largest producing countries: 1- The impact of phytosanitary risks related to Fusarium wilt and the influence of the soil on the incidence of Banana Wilt (BW) caused by a fungal-bacterial complex. 2- An observed trend towards loss of productivity and decline of soil quality in some commercial farms of Aragua and Trujillo states in Venezuela. This PhD Thesis, has combined a systematic bibliographic review, crop and soil information from a systematic survey of different farm types in Venezuela with soil profile descriptions. Using that information, it has validated the hypothesis that by identifying the abiotic properties of the soil, the predisposition of the banana plant to the BW disease, and the potential productivity of the crop can be predicted. This approach can allow the differentiation of zones with different levels of productivity and BW risk, and as an immediate consequence, avoid areas of high risk or low productivity, or adapt agronomical practices to enhance productivity and sustainability of banana cropping systems in Venezuela.
... In recent times, modern approaches, such as machine learning and deep learning algorithms, have been employed to identify the characteristics of banana agroecosystems that could be affecting productivity and the appearance of diseases in the field. Several investigations were carried out in the field of machine learning for the detection and diagnosis of banana diseases, using RF [11,12,[26][27][28], artificial neural networks [11], support vector machine (SVM) [10,11,29,30] and decision trees [26], among others. This study aimed to use a RF model analysis strategy to determine the soil variables that could favor the development of BW disease, with the final aim of helping to avoid using those soils or promoting the application of the appropriate corrective fertilization treatments. ...
... In recent times, modern approaches, such as machine learning and deep learning algorithms, have been employed to identify the characteristics of banana agroecosystems that could be affecting productivity and the appearance of diseases in the field. Several investigations were carried out in the field of machine learning for the detection and diagnosis of banana diseases, using RF [11,12,[26][27][28], artificial neural networks [11], support vector machine (SVM) [10,11,29,30] and decision trees [26], among others. This study aimed to use a RF model analysis strategy to determine the soil variables that could favor the development of BW disease, with the final aim of helping to avoid using those soils or promoting the application of the appropriate corrective fertilization treatments. ...
Article
Full-text available
Over the last few decades, a growing incidence of Banana Wilt (BW) has been detected in the banana-producing areas of the central zone of Venezuela. This disease is thought to be caused by a fungal–bacterial complex, coupled with the influence of specific soil properties. However, until now, there was no consensus on the soil characteristics associated with a high incidence of BW. The objective of this study was to identify the soil properties potentially associated with BW incidence, using supervised methods. The soil samples associated with banana plant lots in Venezuela, showing low (n = 29) and high (n = 49) incidence of BW, were collected during two consecutive years (2016 and 2017). On those soils, sixteen soil variables, including the percentage of sand, silt and clay, pH, electrical conductivity, organic matter, available contents of K, Na, Mg, Ca, Mn, Fe, Zn, Cu, S and P, were determined. The Wilcoxon test identified the occurrence of significant differences in the soil variables between the two groups of BW incidence. In addition, Orthogonal Least Squares Discriminant Analysis (OPLS-DA) and the Random Forest (RF) algorithm was applied to find soil variables capable of distinguishing banana lots showing high or low BW incidence. The OPLS-DA model showed a proper fitting of the data (R2Y: 0.61, p value < 0.01), and exhibited good predictive power (Q2: 0.50, p value < 0.01). The analysis of the Receiver Operating Characteristics (ROC) curves by RF revealed that the combination of Zn, Fe, Ca, K, Mn and Clay was able to accurately differentiate 84.1% of the banana lots with a sensitivity of 89.80% and a specificity of 72.40%. So far, this is the first study that identifies these six soil variables as possible new indicators associated with BW incidence in soils of lacustrine origin in Venezuela.
... They have outperformed traditional imaging techniques in diagnostics of malaria, tuberculosis and intestinal parasite (Quinn et al., 2016). Deep learning has also been used in the diagnosis of crop diseases that attack cassava and bananas in East Africa (Owomugisha et al., 2014;Owomugisha and Mwebaze, 2016). With the help of deep learning, farmers have the potential to better diagnose poultry diseases and improve livestock health which would increase the production. ...
Article
Full-text available
Coccidiosis, Salmonella, and Newcastle are the common poultry diseases that curtail poultry production if they are not detected early. In Tanzania, these diseases are not detected early due to limited access to agricultural support services by poultry farmers. Deep learning techniques have the potential for early diagnosis of these poultry diseases. In this study, a deep Convolutional Neural Network (CNN) model was developed to diagnose poultry diseases by classifying healthy and unhealthy fecal images. Unhealthy fecal images may be symptomatic of Coccidiosis, Salmonella, and Newcastle diseases. We collected 1,255 laboratory-labeled fecal images and fecal samples used in Polymerase Chain Reaction diagnostics to annotate the laboratory-labeled fecal images. We took 6,812 poultry fecal photos using an Open Data Kit. Agricultural support experts annotated the farm-labeled fecal images. Then we used a baseline CNN model, VGG16, InceptionV3, MobileNetV2, and Xception models. We trained models using farm and laboratory-labeled fecal images and then fine-tuned them. The test set used farm-labeled images. The test accuracies results without fine-tuning were 83.06% for the baseline CNN, 85.85% for VGG16, 94.79% for InceptionV3, 87.46% for MobileNetV2, and 88.27% for Xception. Finetuning while freezing the batch normalization layer improved model accuracies, resulting in 95.01% for VGG16, 95.45% for InceptionV3, 98.02% for MobileNetV2, and 98.24% for Xception, with F1 scores for all classifiers above 75% in all four classes. Given the lighter weight of the trained MobileNetV2 and its better ability to generalize, we recommend deploying this model for the early detection of poultry diseases at the farm level.
... After the identification of diseases, grading was done based on the amount of disease present in the leaf. Owomugisha et al. (2014) proposed machine learning techniques to identify bacterial diseases in banana plants. The computer vision technique was investigated to make an algorithm that is further divided into four phases. ...
Article
Plant diseases are spread by a variety of pests, weeds, and pathogens and may have a devastating effect on agriculture if not handled in a timely manner. Farmers face numerous challenges, from water supply and untimely rain to storage facilities and several plant diseases. Crop disease is the primary threat, and it causes enormous loss to farmers in terms of production and finance. Identifying the disease across several hectares of agricultural land is very difficult even with modern technology. Accurate and rapid disease prediction, enabling early treatment of crops, minimizes economic loss to the individual and supports healthy crops. Many studies use modern deep learning approaches to improve the accuracy and performance of object detection and identification systems. The suggested method notifies farmers of different agricultural diseases, prompting them to take essential precautions before the disease spreads to the whole agricultural field. The primary objective of this study is to detect diseases as soon as they begin to spread on the leaves of the plants. Super-Resolution Convolutional Neural Network (SRCNN) and Bicubic models are employed in the system to identify healthy and diseased leaves with an accuracy of 99.175% and 99.156%, respectively.
... Tian used SVM as a classifier for finding the disease in wheat plants [17]. Owomugisha used decision trees, nearest neighbors, naïve Bayes, and random forest for diagnosing diseases in plants [18]. Hall et al. [19] used random forest and CNN on 32 species for classification of leaves and performance achieved is 97.3% classification accuracy. ...
Article
Detection of plant disease has a crucial role in better understanding the economy of India in terms of agricultural productivity. Early recognition and categorization of diseases in plants are crucial, as disease can adversely affect the growth and development of species. Numerous machine learning methods such as SVM (support vector machine), random forest, KNN (k-nearest neighbor), Naïve Bayes, and decision tree have been exploited for recognition, discovery, and categorization of plant diseases; however, the advancement of machine learning by DL (deep learning) is supposed to possess tremendous potential in enhancing the accuracy. This paper proposes a model comprising an Auto-Color Correlogram as image filter and DL as classifiers with different activation functions for plant disease. The proposed model is implemented on four different datasets to solve binary and multiclass subcategories of plant diseases. Using the proposed model, the results achieved are better, obtaining 99.4% accuracy and 99.9% sensitivity for binary class and 99.2% accuracy for multiclass. The proposed model is reported to outperform other approaches, namely LibSVM, SMO (sequential minimal optimization), and DL with activation functions softmax and softsign, in terms of F-measure, recall, MCC (Matthews correlation coefficient), specificity and sensitivity.
Article
Agricultural productivity is the asset on which the world’s economy thoroughly relies. This is one of the major reasons that disease identification in fruits and plants occupies a salient role in farming, as disease in them is common. Genuine supervision is needed to avoid serious consequences in vegetation; otherwise, the corresponding vegetation standards, quantity, and productivity are affected. At present, a recognition system is required in the food handling industries to improve productivity and cope with demand in the community. The study has been carried out to perform a systematic literature review of research papers that deployed machine learning (ML) techniques in agriculture, applicable to the banana plant and fruit production. Thus, it could help upcoming researchers in their endeavors to identify the level and kind of research done so far. The authors investigated the problems related to banana crops such as disease classification, chilling injuries detection, ripeness, moisture content, etc. Moreover, the authors have also reviewed the deployed frameworks based on ML, sources of data collection, and the comprehensive results achieved for each study. Furthermore, ML architectures/techniques were evaluated using a range of performance measures. It has been observed that some studies used the PlantVillage dataset, a few have used the Godliver and Scotnelson datasets, and the rest were based on either real-field image acquisition or limited private datasets. Hence, more datasets need to be acquired to enhance the disease identification process and to handle the other kinds of problems (e.g. chilling injuries detection, ripeness, etc.) present in the crops. Furthermore, the authors have also carried out a comparison of popular ML techniques like support vector machines, convolutional neural networks, regression, etc. to distinguish their performance. In this study, several research gaps are addressed, allowing for increased transparency in identifying different diseases even before symptoms arise and also for monitoring the above-mentioned problems related to crops.
Article
A machine vision system is a modern technique used for grading a wide range of agricultural crops. The objective of this research is the qualitative sorting of potatoes by means of a lighting chamber, camera, frame grabber and computer, capturing suitable images and analyzing them with MATLAB software. A total of 110 Agria potatoes were selected randomly and placed under the same lighting conditions. The images were transferred by the frame grabber to computer memory to be analyzed. The samples had been pre-graded on the same face that was placed in the lighting chamber, and the percentage of the health class was recorded. After applying pre-processing techniques to the images, the combination of the HSV color space and a logarithmic transformation with a coefficient of 0.5 was selected. The correction coefficient between the health class of the pre-graded method and the results of the implemented algorithm was 0.989, which was the highest. Qualitative sorting accuracy with this method was 96.54%.
Chapter
The gold standard test for malaria is the hundred-year-old method of preparing a blood smear on a glass slide, staining it, and examining it under a microscope to look for the parasite genus Plasmodium. While several rapid diagnostic tests are also currently available, they still have shortcomings compared to microscopic analysis [18]. In the regions worst affected by malaria, reliable diagnoses are often difficult to obtain, and treatment is routinely prescribed based only on symptoms. Accurate diagnosis is clearly important, since false negatives can be fatal and false positives lead to increased drug resistance, unnecessary economic burden, and possibly the failure to treat diseases with similar early symptoms such as meningitis or typhoid. The scale of the problem is huge: annually there are 300-500 million cases of acute malaria illness of which 1.1-2.7 million are fatal, most fatalities being among children under the age of five [27,21,22].
Conference Paper
We review accuracy estimation methods and compare the two most common methods: cross-validation and bootstrap. Recent experimental results on artificial data and theoretical results in restricted settings have shown that for selecting a good classifier from a set of classifiers (model selection), ten-fold cross-validation may be better than the more expensive leave-one-out cross-validation. We report on a large-scale experiment (over half a million runs of C4.5 and a Naive-Bayes algorithm) to estimate the effects of different parameters on these algorithms on real-world datasets. For cross-validation, we vary the number of folds and whether the folds are stratified or not; for bootstrap, we vary the number of bootstrap samples. Our results indicate that for real-world datasets similar to ours, the best method to use for model selection is ten-fold stratified cross-validation, even if computation power allows using more folds.
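To make the stratified cross-validation idea in this abstract concrete, the following is a minimal sketch of how stratified folds could be built. It is not the procedure used in the cited experiments: the function name `stratified_kfold` is hypothetical, and the sketch deals indices out in their original order without shuffling, which a real evaluation would normally add.

```python
from collections import defaultdict


def stratified_kfold(labels, k=10):
    """Return k (train_indices, test_indices) pairs.

    Illustrative assumption: indices are dealt round-robin, class by class,
    so each fold receives a near-equal share of every class.
    """
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(k)]
    slot = 0  # running counter keeps fold sizes balanced across classes
    for indices in by_class.values():
        for idx in indices:
            folds[slot % k].append(idx)
            slot += 1

    splits = []
    for i in range(k):
        test = sorted(folds[i])
        train = sorted(
            idx for j, fold in enumerate(folds) if j != i for idx in fold
        )
        splits.append((train, test))
    return splits
```

Each sample appears in exactly one test fold, and the class counts in any two folds differ by at most one, which is the property that distinguishes stratified from plain k-fold splitting.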
Article
We propose and experimentally evaluate a software solution for automatic detection and classification of plant leaf diseases. The proposed solution is an improvement on the solution proposed in [1], as it is faster and more accurate. The developed processing scheme consists of four main phases, as in [1]. The following two steps are added successively after the segmentation phase. In the first step, we identify the mostly green colored pixels. Next, these mostly green pixels are masked based on specific threshold values that are computed using Otsu's method. In the other additional step, the pixels with zero red, green and blue values and the pixels on the boundaries of the infected cluster (object) are completely removed. The experimental results demonstrate that the proposed technique is robust for the detection of plant leaf diseases. The developed algorithm can detect and classify the examined diseases with a precision between 83% and 94%, and can achieve a 20% speedup over the approach proposed in [1].
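The green-pixel masking step described in this abstract can be illustrated with a small sketch. The abstract does not give the exact masking rule, so the rule below (mask a pixel when its green value exceeds the Otsu threshold of the green channel and is also the largest of the three channels) is an assumption, and the names `otsu_threshold` and `mask_green_pixels` are hypothetical.

```python
def otsu_threshold(values, levels=256):
    """Return the threshold t maximizing between-class variance.

    Values <= t form one class, values > t the other. Inputs are assumed
    to be integers in the range [0, levels).
    """
    hist = [0] * levels
    for v in values:
        hist[v] += 1
    total = len(values)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # number of values at or below the candidate threshold
    sum0 = 0  # sum of those values
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, so variance is undefined
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        between = w0 * w1 * (mean0 - mean1) ** 2
        if between > best_var:
            best_t, best_var = t, between
    return best_t


def mask_green_pixels(pixels):
    """Zero out (r, g, b) pixels that are mostly green.

    Assumed rule: green exceeds the Otsu threshold of the green channel
    and dominates both red and blue.
    """
    threshold = otsu_threshold([g for _, g, _ in pixels])
    return [
        (0, 0, 0) if g > threshold and g > r and g > b else (r, g, b)
        for r, g, b in pixels
    ]
```

Masked pixels are set to zero in all three channels, which is consistent with the abstract's later step of removing pixels with zero red, green and blue values.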
Book
Knowledge Discovery demonstrates intelligent computing at its best, and is the most desirable and interesting end-product of Information Technology. To be able to discover and to extract knowledge from data is a task that many researchers and practitioners are endeavoring to accomplish. There is a lot of hidden knowledge waiting to be discovered – this is the challenge created by today’s abundance of data. Data Mining and Knowledge Discovery Handbook, Second Edition organizes the most current concepts, theories, standards, methodologies, trends, challenges and applications of data mining (DM) and knowledge discovery in databases (KDD) into a coherent and unified repository. This handbook first surveys, then provides comprehensive yet concise algorithmic descriptions of methods, including classic methods plus the extensions and novel methods developed recently. This volume concludes with in-depth descriptions of data mining applications in various interdisciplinary industries including finance, marketing, medicine, biology, engineering, telecommunications, software, and security. Data Mining and Knowledge Discovery Handbook, Second Edition is designed for research scientists, libraries and advanced-level students in computer science and engineering as a reference. This handbook is also suitable for professionals in industry, for computing applications, information systems management, and strategic research management.
Article
Random forests are a combination of tree predictors such that each tree depends on the values of a random vector sampled independently and with the same distribution for all trees in the forest. The generalization error for forests converges a.s. to a limit as the number of trees in the forest becomes large. The generalization error of a forest of tree classifiers depends on the strength of the individual trees in the forest and the correlation between them. Using a random selection of features to split each node yields error rates that compare favorably to Adaboost (Y. Freund & R. Schapire, Machine Learning: Proceedings of the Thirteenth International conference, ∗∗∗, 148–156), but are more robust with respect to noise. Internal estimates monitor error, strength, and correlation and these are used to show the response to increasing the number of features used in the splitting. Internal estimates are also used to measure variable importance. These ideas are also applicable to regression.
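The two sources of randomness named in this abstract, bootstrap sampling of the training data and random selection of features at each split, can be sketched with a toy forest. This is a deliberate simplification and not the algorithm from the cited article: each tree is reduced to a single split (a decision stump), and `fit_stump`, `fit_forest` and `predict` are hypothetical names.

```python
import random
from collections import Counter


def fit_stump(rows, labels, feature_ids):
    """Pick the (feature, threshold) among feature_ids with fewest errors."""
    best = None
    for f in feature_ids:
        for threshold in sorted({row[f] for row in rows}):
            left = [y for row, y in zip(rows, labels) if row[f] <= threshold]
            right = [y for row, y in zip(rows, labels) if row[f] > threshold]
            if not left or not right:
                continue
            left_label = Counter(left).most_common(1)[0][0]
            right_label = Counter(right).most_common(1)[0][0]
            errors = sum(y != left_label for y in left)
            errors += sum(y != right_label for y in right)
            if best is None or errors < best[0]:
                best = (errors, f, threshold, left_label, right_label)
    if best is None:
        # No valid split exists: always predict the majority label.
        majority = Counter(labels).most_common(1)[0][0]
        return (0, float("inf"), majority, majority)
    return best[1:]


def fit_forest(rows, labels, n_trees=25, n_features=1, seed=0):
    """Train stumps on bootstrap samples, each with a random feature subset."""
    rng = random.Random(seed)
    n = len(rows)
    forest = []
    for _ in range(n_trees):
        sample = [rng.randrange(n) for _ in range(n)]  # draw with replacement
        feature_ids = rng.sample(range(len(rows[0])), n_features)
        forest.append(
            fit_stump(
                [rows[i] for i in sample],
                [labels[i] for i in sample],
                feature_ids,
            )
        )
    return forest


def predict(forest, row):
    """Majority vote over all stumps in the forest."""
    votes = Counter(
        left if row[f] <= t else right for f, t, left, right in forest
    )
    return votes.most_common(1)[0][0]
```

A full random forest grows deep trees and re-draws the feature subset at every node; the stump version keeps only the bagging and voting structure so that the sketch stays short.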
Conference Paper
K-Nearest-Neighbours is one of the simplest yet effective classification methods. The core computation behind it is to calculate the distance from a query point to all of its neighbours and to choose the closest one. The Euclidean distance is the most frequent choice, although other distances are sometimes required. This paper explores a simple yet effective similarity definition within Nearest Neighbours for intrusion detection applications. This novel similarity rule is fast to compute and achieves a very satisfactory performance on the intrusion detection benchmark data sets tested.
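The core computation described in this abstract, measuring the distance from a query point to every stored point and choosing the closest, can be sketched as follows. The sketch uses the Euclidean distance only; the novel similarity rule proposed in the cited paper is not described in the abstract and is not reproduced here. The function name `knn_predict` is hypothetical.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=1):
    """Label a query point by majority vote among its k nearest neighbours."""
    # Distance from the query to every training point (Euclidean).
    distances = sorted(
        (
            (math.dist(point, query), label)
            for point, label in zip(train_points, train_labels)
        ),
        key=lambda pair: pair[0],
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]
```

With `k=1` this reduces to returning the label of the single closest point, which matches the abstract's description; larger `k` values trade sensitivity to noise for smoother decision boundaries.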