Are wing contours good classifiers for automatic identification in Odonata? A view from the Targeted Odonata Wing Digitization (TOWD) project

Abstract

In recent decades, a lack of available knowledge about the magnitude, identity and distribution of biodiversity has given way to a taxonomic impediment where species are not being described as fast as the rate of extinction. Using Machine Learning methods based on seven different algorithms (LR, CART, KNN, GNB, LDA, SVM and RFC) we have created an automatic identification approach for odonate genera, through images of wing contours. The training population is composed of the collected specimens that have been digitized in the framework of the NSF funded Odomatic and TOWD projects. Each contour was pre-processed, and 80 coefficients were extracted for each specimen. These form a database with 4656 rows and 80 columns, which was divided into 70% for training and 30% for testing the classifiers. The classifier with the best performance was a Linear Discriminant Analysis (LDA), which discriminated the highest number of classes (100) with an accuracy value of 0.7337, precision of 0.75, recall of 0.73 and a F1 score of 0.73. Additionally, two main confusion groups are reported, among genera within the suborders of Anisoptera and Zygoptera. These confusion groups suggest a need to include other morphological characters that complement the wing information used for the classification of these groups thereby improving accuracy of classification. Likewise, the findings of this work open the door to the application of machine learning methods for the identification of species in Odonata and in insects more broadly which would potentially reduce the impact of the taxonomic impediment.
Sáenz Oviedo, Kuhn, ... & Sanchez-Herrera: Are wing contours good classifiers for automatic identification in Odonata?
International Journal of Odonatology, Volume 25, pp. 96–106
International Journal of Odonatology
2022, Vol. 25, pp. 96–106
doi:10.48156/1388.2022.1917184
Research Article
OPEN ACCESS
This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Published: 8 December 2022
Received: 11 June 2022
Accepted: 7 November 2022
Citation:
Sáenz Oviedo, R. Kuhn, Rondon Sepulveda, Abbott, Ware & Sanchez-Herrera (2022): Are wing contours good classifiers for automatic identification in Odonata? A view from the Targeted Odonata Wing Digitization (TOWD) project. International Journal of Odonatology, 25, 96–106. doi:10.48156/1388.2022.1917184
Data Availability Statement:
All relevant data are within the paper and its Supporting Information files.
Are wing contours good classiers
for automac idencaon in Odonata?
A view from the Targeted Odonata
Wing Digizaon (TOWD) project
Mayra A. Sáenz Oviedo 1, William R. Kuhn 2, Marn A. Rondon Sepulveda 1,
John Abbo 3, Jessica L. Ware 4 & Melissa Sanchez-Herrera 4,5*
1 Department of Epidemiology and Biostascs. Poncia Universidad Javeriana, Ak. 7 # 40-62,
10231, Bogotá, Colombia
2 Discover life in America, Gatlinburg, TN, USA
3 Department of Museum Research & Collecons. University of Alabama Museums. Tuscaloosa,
AL 35487, USA
4 Division of Invertebrate Zoology. American Museum of Natural History. 200 Central Park West
at West 79th Street, New York, NY 10024, USA
5 Faculty of Natural Sciences, Biology Department, Universidad del Rosario,
Sede Quinta Mus. Ak. 26 # 63C-48, 11122, Bogotá, Colombia
* Corresponding author. Email: melsanc@gmail.com, melissa.sanchezh@urosario.edu.co
Key words. Classicaon, Machine Learning, supervised, wings
Introducon
Dragonies and damselies (Odonata) are one of the most charismac insect
groups, due to their relavely big size, ight paerns and beauful coloraons.
Their associaon with aquac environments means they serve as excellent bioin-
dicators of water quality, given their high suscepbility to environmental changes
(Córdoba-Aguilar, 2008; Moore, 1997; Samways & Steytler, 1996). Global odonate
richness is esmated to comprise around 6.323 species (Paulson et al., 2021). This
is a relavely small number of species in comparison with other insect orders like
Sáenz Oviedo, Kuhn, ... & Sanchez-Herrera Are wing contours good classiers for automac idencaon in Odonata?
97Internaonal Journal of Odonatology │ Volume 25 │ pp. 96–106
Coleoptera, which includes approximately 300,000 spe-
cies (Lorenzo-Carballa & Cordero Rivera, 2012; Paulson
personal communicaon, May 14, 2021). However, re-
searchers suspect that ~20% of species remain to be
discovered (Kalkman et al., 2008). The tangible and rel-
avely low diversity in odonates makes them an ideal
scenario to address the “taxonomic impediment”—a
lack of available knowledge about and trained exper-
se to determine the magnitude, identy, and distribu-
on of biodiversity (González, 2009). This phenomenon
is of parcular interest given the current worldwide
“biodiversity crisis” (i.e., rapid declining of populaons
as a result of massive habitat destrucon and climate
change), in which, there are esmates that ~50% of the
living species will face exncon in the next 50 years
(Koh et al., 2004). Maximizing eorts to gather and
learn the taxonomy and biology of species is more rel-
evant now than ever (Ceballos et al., 2015; Kuhn, 2016;
La Salle et al., 2016).
Several morphological characteristics define the odonate suborders Zygoptera and Anisoptera, including the shape of the wings, head, thorax, abdomen and genitalia (Garrison et al., 2006). In particular, wing shape and venation patterns are among the most commonly used traits to classify dragonflies and damselflies to family or genus level. For example, without the aid of a microscope one can easily differentiate the suborders, because anisopterans have fore- and hindwings of different shapes, which remain perpendicular to the body at rest, while zygopterans’ fore- and hindwings are similar in shape and are usually folded in line with the body (Heckman, 2006, 2008). Appel & Gorb (2014) recently proposed detailed micro-morphological characteristics of the wings, such as the types of vein joints and the combinations among them (i.e., four types of vein joints and five combinations), spine distribution across the wings (i.e., located on transversal veins, possibly involved in movement limitation), and the distribution of patches of the flexible protein resilin in the wings (e.g., on the joints and/or along the veins). These new morphological traits have been discussed in the classification of both suborders, and are used to infer function and flight behavior.
Recently, Kuhn (2016) developed an automatic classification system for 26 dragonfly genera, using standardized image scans of specimen wings. He trained and classified them using a random forest algorithm, extracting feature vectors that describe texture and patterning through Gabor wavelet filters, together with a color assessment based on chromaticity-standardized sampling within the images. Here we assessed the power of a novel classifier trait for wings, their contour, to classify specimens to genus. Using standardized wing images from the Targeted Odonata Wing Digitization project, we tested multiple Machine Learning classification algorithms (Linear Discriminant Analysis, LDA; Logistic Regression, LR; Classification and Regression Trees, CART; K-Nearest Neighbors, KNN; Naive Bayes, NB; Support Vector Machines, SVM; and Random Forest Classifier, RFC) to establish the potential use of the wing contour within automated classification systems for odonates.
Materials and methods
We analyzed data from the Targeted Odonata Wing Digitization Project (TOWD; https://digitizingdragonflies.org), which aims to digitize the wings of all North American species of Odonata and to develop tools for automatically extracting useful characters from odonate wings to facilitate comparative studies and automatic species classification. The dataset comprised 2,328 dragonfly and damselfly specimens from 111 genera, which were digitized through the TOWD Project, and consisted of the contour (outline) of the fore- and hindwings of each specimen. These data were extracted from digital scans of the specimens using an edge-finding algorithm to recover a series of points (x,y-coordinates) representing the location of each pixel along the edge of a wing. In most cases, the contours represented the right wings, which were excised from the specimen’s body and scanned on a flatbed scanner; in some cases the left-side wings were scanned because the right ones were damaged (see Supplementary Table 1 for a list of specimens). In the latter case, wing contours were reflected left-to-right to match up with right-side wing contours. As part of the TOWD preprocessing, each contour was rotated so that the upper side (costal margin) was approximately horizontal, translated so that the upper-left corner was at (0,0), and scaled to millimeters. The edges of some wings were damaged, which was also apparent in their respective contours; such damage was used as an exclusion criterion.
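A minimal sketch of the mirroring, translation and scaling described above, in plain Python. The function name `normalize_contour`, the `mm_per_pixel` parameter and the sample coordinates are hypothetical and are not taken from the TOWD code. The rotation of the costal margin is omitted, and the upper-left corner is assumed to be the minimum x and y of the outline (image coordinates with y increasing downward).

```python
def normalize_contour(points, is_left_wing=False, mm_per_pixel=1.0):
    """Mirror left wings, move the upper-left corner to (0, 0), scale to mm."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    if is_left_wing:
        # Reflect left-to-right so the outline matches a right-side wing.
        xs = [-x for x in xs]
    min_x, min_y = min(xs), min(ys)
    # Translate the bounding box to the origin, then convert pixels to mm.
    return [((x - min_x) * mm_per_pixel, (y - min_y) * mm_per_pixel)
            for x, y in zip(xs, ys)]


# Hypothetical outline of a left wing, in pixels.
left_wing = [(10, 5), (40, 5), (40, 15), (10, 12)]
print(normalize_contour(left_wing, is_left_wing=True, mm_per_pixel=0.5))
```

Mirroring before translating keeps the result independent of where the wing sat on the scanner bed, which is the property the preprocessing relies on.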
Contour data were preprocessed and analyzed in Python (van Rossum & Drake Jr, 2009; v. 3.9.2) using the Spyder Integrated Development Environment (IDE; Raybaut, 2009; v. 4.2.1), which is part of the Anaconda Software Distribution (2016). Data treatment was divided into four main steps (code available at https://doi.org/10.5281/zenodo.6614239):
(i) Preprocessing and extraction of Fourier descriptors: Every contour was standardized to ensure the comparability of the data and improve classification accuracy (Pal & Sudeep, 2016). This was accomplished with a series of functions that returned a slightly modified set of coordinates sharing the same main characteristics. The contour was closed by appending the first coordinate after the last one when the two did not coincide; the direction of the coordinates of every contour was verified and, if necessary, changed to a clockwise orientation; and when the contour contained fewer than 200 points, additional points were interpolated. Next, the apex of the wing was located, and the coordinates were rotated to make it the starting point. Finally, the contour was checked again to ensure it had been closed. After the preprocessing, the Fourier descriptor coefficients were extracted using PyEFD (Blidh, 2016), a Python implementation for approximating contours with a Fourier series. This process allowed the extraction of the same number of coefficients for each wing, regardless of its size. The normalized coefficients were kept in a separate database, together with each specimen’s unique identifier (uniq-id).
(ii) Division of the database into training and test datasets: A train-test division was performed following a 70/30 proportion: 70% to train the model and 30% for testing/validation.
(iii) Denion, training, and tesng of classier algo-
rithms: Seven classiers were chosen to be trained
and tested for classicaon from the Scikit-learn
distribuon (Pedregosa et al., 2011):
Logisc Regression (LR) is a binary linear classier,
which is the simplest and is used as a baseline mod-
el. To adjust LR to a mulclass problem, where the
classicaon is done with a one vs rest method, the
opon mul_class = 'ovr' was set.
Classicaon and Regression Trees (CART) is a mul-
class classier that uses recursive paroning fol-
lowing the Gini Impurity Index to build a decision
tree.
K-Nearest Neighbors (KNN) is a mulclass classier
that assumes similarity depending on class proxim-
ity, calculated as an Euclidean distance.
Naïve Bayes (NB) is a mulclass classier that as-
sumes condional independence between every
pair of classes.
Linear Discriminant Analysis (LDA): is a linear clas-
sier for a mulclass problem. It ensures the maxi-
mum separability of classes by reinforcing the pro-
poron of intra and inter class variance (Narayan,
2020; Tharwat et al., 2017).
Support Vector Machines (SVM) build a hyperplane
or group of hyperplanes on a higher dimensionality
space that allow the separaon of nonlinear prob-
lems (Gandhi, 2018). The opon StandardScaler
was used to normalize and scale the data; and the
opon SVC, is used to specify the classicaon task.
Random Forest Classier (RFC) ts several decision
trees on dierent sub-samples of the data. To set
the number of trees in the forest, the opon n_es-
mators = 200 was set.
In addion, for each classier, a cross validaon
score and a classicaon report was obtained with
ve items: Accuracy (number of correct predic-
ons from the total number of predicons), Preci-
sion (number of true posives from all the posive
predicons), Recall (number of posive predicons
from the total number of posive classes), F1 score
(following equaon:
2 (True Posives (TP) × False Posives (FP) ÷ 2 TP +
FP + False Negaves (FN))
and Support (number of individuals in each class).
(iv) Confusion matrices: Confusion matrices were plotted for each classifier to obtain a detailed visualization of the classification errors. In them, the predicted and real classes are found on the x- and y-axis, respectively. The correct predictions of the classifier are found on the diagonal, where the predicted and true labels coincide. Consequently, the predictions that lie outside this diagonal correspond to classification errors that inform about the performance of the classifiers, as well as about possible confusion patterns.
Finally, we performed ANOVA and Tukey tests to compare the accuracy and F1 scores from the classification report, along with box plots calculated from the data.
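For readers unfamiliar with the comparison, a one-way ANOVA reduces to the ratio of between-group to within-group variance. The sketch below computes that F statistic in plain Python. The per-fold scores are made-up placeholders, not the values behind Figure 1, and the Tukey post-hoc step is not reproduced.

```python
def one_way_anova_f(groups):
    """Return the F statistic of a one-way ANOVA for a list of samples."""
    values = [v for group in groups for v in group]
    grand_mean = sum(values) / len(values)
    means = [sum(group) / len(group) for group in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of the observations around their own group mean.
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# Placeholder 3-fold accuracy scores for three hypothetical classifiers.
fold_scores = [[0.72, 0.73, 0.74], [0.60, 0.62, 0.61], [0.55, 0.57, 0.56]]
print(one_way_anova_f(fold_scores))
```

A large F indicates that the classifiers differ more from one another than the folds differ within each classifier; the p-value would then come from the F distribution with the two degrees of freedom computed above.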
Figure 1. Boxplots of (A) Accuracy (number of correct predictions out of the total number of predictions) and (B) F1 score (a measure of a model’s accuracy on a dataset that follows the formula (2 × Precision × Recall) ÷ (Precision + Recall)) from 3-fold cross-validation for each of the seven classifiers tested. A total of 1397 individuals was used for each testing dataset per classifier; same letters indicate non-significant comparisons, and p-values are shown for the CART vs. NB and SVM vs. RFC comparisons, which were non-significant for both scores.
Results
After preprocessing the images and running the Fourier extraction loop we defined, we obtained a dataset of 4656 rows and 81 columns of Fourier coefficients. Each row of this dataset belongs to an individual organism and each column to one of the coefficients. In total, we obtained 39 descriptors for each wing (hindwing and forewing) per individual, to be used later in the classification.
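The study extracted its descriptors with PyEFD. As a dependency-free illustration of what such descriptors are, the sketch below implements the Kuhl and Giardina elliptic Fourier coefficients for a closed contour, plus a clockwise-orientation check like the one in preprocessing step (i). It is not the authors' code: the function names are hypothetical, the default of 10 harmonics is an assumption, no normalization is applied, and the orientation test assumes a y-up coordinate system.

```python
import math


def signed_area(points):
    """Shoelace formula; positive for counterclockwise order (y axis up)."""
    shifted = points[1:] + points[:1]
    return 0.5 * sum(x0 * y1 - x1 * y0 for (x0, y0), (x1, y1) in zip(points, shifted))


def make_clockwise(points):
    """Reverse the point order if the contour runs counterclockwise."""
    return points[::-1] if signed_area(points) > 0 else list(points)


def elliptic_fourier_coefficients(points, order=10):
    """Return [(a_n, b_n, c_n, d_n)] for n = 1..order of a closed contour."""
    points = list(points)
    if points[0] != points[-1]:
        points.append(points[0])  # close the contour
    segments = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        dx, dy = x1 - x0, y1 - y0
        dt = math.hypot(dx, dy)
        if dt > 0:  # skip duplicated points
            segments.append((dx, dy, dt))
    perimeter = sum(dt for _, _, dt in segments)
    coefficients = []
    for n in range(1, order + 1):
        scale = perimeter / (2 * n * n * math.pi ** 2)
        a = b = c = d = 0.0
        t_prev = 0.0
        for dx, dy, dt in segments:
            t_next = t_prev + dt
            phi_prev = 2 * math.pi * n * t_prev / perimeter
            phi_next = 2 * math.pi * n * t_next / perimeter
            d_cos = math.cos(phi_next) - math.cos(phi_prev)
            d_sin = math.sin(phi_next) - math.sin(phi_prev)
            a += dx / dt * d_cos
            b += dx / dt * d_sin
            c += dy / dt * d_cos
            d += dy / dt * d_sin
            t_prev = t_next
        coefficients.append((scale * a, scale * b, scale * c, scale * d))
    return coefficients


# Hypothetical toy outline (a unit square, listed counterclockwise).
square = make_clockwise([(0, 0), (1, 0), (1, 1), (0, 1)])
coeffs = elliptic_fourier_coefficients(square, order=10)
print(len(coeffs) * 4)  # number of coefficients for one contour
```

Because the coefficients are indexed by harmonic rather than by contour point, every outline yields the same number of values whatever its size or point count.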
To the laer database, we tagged the genus label to
each individual (row) in order to create a training and a
tesng set following a 70:30 proporon, respecvely.
As a result we generated 3259 individuals (70%) for the
training, and 1397 individuals (30%) for the tesng sets.
The accuracy scores were similar enough in all seven
classiers dened (LDA, SVC, LR, CART, NB, RFC, KNN)
between the two sets, which rules out possible over-
ng of the classicaon models (see Supplementary
Tables 2, 3). Furthermore, using the tesng set the clas-
sicaon report obtained showed that the LDA classi-
er had the best performance in terms of: (1) accuracy
(0.7337); (2) precision (0.75); (3) recall (0.73) and F1
score (0.73); in comparison with the other six classi-
ers tested (Supplementary Table 3). The ANOVAs per-
formed for the F1 score and accuracy were signicant
(Fig. 1; Supplementary Tables 4 + 6), across the models.
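The set sizes reported above follow directly from the 70:30 proportion applied to 4656 rows. The sketch below shows one way to obtain them with a shuffled index split. The function name and the seed are hypothetical, and rounding the test share up is an assumption, chosen because it reproduces the reported counts.

```python
import math
import random


def split_indices(n_rows, test_fraction=0.30, seed=0):
    """Shuffle row indices and split them into training and testing sets."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    # Assumption: the test share is rounded up and training takes the rest.
    n_test = math.ceil(n_rows * test_fraction)
    return indices[n_test:], indices[:n_test]


train, test = split_indices(4656)
print(len(train), len(test))  # 3259 1397
```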
Figure 2. Confusion matrix. (A) LDA confusion matrix. (B) Confusion matrix showing misclassification zones distributed mainly across four families: Gomphidae (blue), Libellulidae (red), Coenagrionidae (green) and Lestidae (orange). Each cell of the matrix corresponds to a possible pairing of true label and predicted label. The color bar on the side of each plot shows the code for the number of coincidences in each cell (from 0 = white to 80 = dark blue).
The post-hoc Tukey multiple comparisons test showed differences in accuracy and F1 score among all the classifiers, with the exception of the CART and NB comparison and the SVM and RFC comparison (Fig. 1; Supplementary Tables 5 and 7). These performance metrics rely on the confusion matrix calculated for each model. The LDA classifier confusion matrix shows the highest number of individuals on the diagonal, meaning these are true positives (Fig. 2A). Despite its better performance, we noticed that particular taxa were consistently misclassified by almost all classifiers; we call these confusion groups (Fig. 2B, Supplementary Figures 1–6). Particular genera within the following four families (Gomphidae, Libellulidae, Coenagrionidae and Lestidae) seem to be responsible for the misclassification observed (Fig. 3).
Discussion
Image preprocessing funcons allowed a standardiza-
on of the coordinates on the contour dataset. This
process has been found to guarantee data comparabil-
ity and improve classicaon accuracy when compared
with non-preprocessed images (Pal & Sudeep, 2016;
Shahriar & Li, 2020; Sharma et al., 2020). The similarity
of accuracy scores for all the classiers in both training
and tesng sets, suggest that there were not overt-
ng issues in the models tested (Brownlee, 2017). Fur-
thermore, we detected dierent numbers of classes
(genera) for each of the seven classiers. For example,
the classier with the best performance, LDA, created
and recognized a total of 100 (classes proxy of genera)
from 111 genera we included in the taxon sampling.
Figure 2. Connued (see page before).
Sáenz Oviedo, Kuhn, ... & Sanchez-Herrera Are wing contours good classiers for automac idencaon in Odonata?
101Internaonal Journal of Odonatology │ Volume 25 │ pp. 96–106
Figure 3. Confusion groups.
(A) Ani soptera: True (real) label
on le column and Predicted
label on the right column. Sur-
rounded by a red square (top):
the Gomphidae genus Arigom-
phus (True label) was predicted
as Hylogomphus, Progomphus,
Ophiogomphus, Stylurus and
Gomphurus (also from the Gom-
phidae family). At the boom
of the gure, surrounded by a
blue square: the genus Libel-
lula (Libellulidae), was confused
with Gomphurus (Gomphidae),
Erythemis (Libellulidae), Aeshna
(Aeshnidae) and Coryphaeshna
(Aeshnidae). (B) Zygoptera: True
(real) label on le and right col-
umn and Predicted label on the
center column. Surrounded by a
red square (top le) Coenagrio-
nidae genus Enallagma was pre-
dicted as Argia, Acanthagrion
and Cyanallagma (Also Coen-
agrionidae genera). Surrounded
by a green square,the genus
Lestes (Lesdae) was predicted
as Coenagrionidae genera Argia
and Enallagma. At the boom of
the gure, surrounded by a blue
square, genus Ischnura (Coen-
agrionidae), was predicted as
Acanthagrion, Argia, Enallagma
(all Coenagrionidae) and Lestes
(Lesdae). Illustraons from
Amanda Whispell.
Sáenz Oviedo, Kuhn, ... & Sanchez-Herrera Are wing contours good classiers for automac idencaon in Odonata?
102Internaonal Journal of Odonatology │ Volume 25 │ pp. 96–106
(see Table 1, Supplementary Table 3). These dierences
may be due to class imbalance, meaning that there is
unequal representaon of genera in the dataset, with
some of them having only one individual in the data-
set (Table 1). Therefore, it is possible that during the
data paroning some groups were not included in the
training dataset, which prevents the label from being
created and in consequence, it would then not be in-
cluded in the classicaon report. Likewise, if any of
the groups were not represented in the tesng dataset
then its label would sll be created, but the values for
the metrics would be zero.
Furthermore, since machine learning algorithms depend on the distribution of classes in the training set to estimate the probability of observing examples in each class, class imbalance causes algorithms to learn that less well represented classes are not as important as the majority classes, so performance will be better for the latter (Brownlee, 2017). To address this problem, an alternative could be to partition the dataset in a stratified way, to ensure that every class is represented in the same proportion in the training and test sets. Moreover, it is necessary to increase the number of individuals in the less represented genera.
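A stratified partition of the kind suggested above can be sketched in plain Python as follows. This is an illustration rather than the procedure used in the study: the function name is hypothetical, the labels are placeholders, and the rule of keeping at least one individual of every class in the training set is an assumption made so that every label is created during training.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.30, seed=0):
    """Split row indices so each class keeps roughly the same proportions."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    rng = random.Random(seed)
    train, test = [], []
    for label in sorted(by_class):
        members = by_class[label]
        rng.shuffle(members)
        # Assumption: always leave at least one individual for training,
        # so classes with a single specimen never reach the test set.
        n_test = min(round(len(members) * test_fraction), len(members) - 1)
        test.extend(members[:n_test])
        train.extend(members[n_test:])
    return train, test


# Placeholder labels: two well-sampled classes and one singleton.
labels = ["A"] * 10 + ["B"] * 20 + ["C"]
train, test = stratified_split(labels)
print(len(train), len(test))
```

Stratifying does not cure the imbalance itself: a genus with one specimen can be learned or tested, but not both, which is why more specimens are still needed for the rarest genera.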
According to Kuhn (2016), accuracy values are not strongly affected by the number of classes. In his study, a comparison was made between models with different numbers of classes, which ranged from three to 26. The results suggest that a greater number of classes does not have a significant effect on accuracy, whose variation decreased slightly as the number of classes increased, staying around 80%. Thus, it is possible to infer that the influence of the number of classes on accuracy in the present study is also low.
The qualitave assessment of the confusion matrix
(Fig. 2, Supplementary Figures 1–6), reveals classica-
on mistakes in parcular taxa just by looking at wing
contours. The diagonal of the matrices shows the co-
incidences between the real and the predicted labels:
if there individuals appear along this diagonal, that
means the performance of the classier is beer, since
on this diagonal the coincidences between the true la-
bels and the predicons (true posives) will be found
(we expect a 1:1 relaonship if so). Consequently, on ei-
ther side of these true posives diagonals, classicaon
errors (false posives and false negaves) are found.
True negaves, meanwhile, correspond to all the true
instances found on the diagonal, dierent from the one
of interest (Harrington, 2012).
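To make the reading of a confusion matrix concrete, the sketch below counts (true, predicted) label pairs and derives the true positives, false positives and false negatives of one genus. The function names are hypothetical, and the labels are made up for illustration; they are not predictions from this study.

```python
from collections import Counter


def confusion_counts(true_labels, predicted_labels):
    """Count how often each (true, predicted) label pair occurs."""
    return Counter(zip(true_labels, predicted_labels))


def per_class_errors(pairs, target):
    """Return (TP, FP, FN) for one class from the pair counts."""
    tp = pairs[(target, target)]  # diagonal cell of the class
    fp = sum(n for (t, p), n in pairs.items() if p == target and t != target)
    fn = sum(n for (t, p), n in pairs.items() if t == target and p != target)
    return tp, fp, fn


# Made-up labels for illustration only.
true = ["Lestes", "Lestes", "Lestes", "Argia", "Argia", "Enallagma"]
pred = ["Lestes", "Argia", "Enallagma", "Argia", "Lestes", "Enallagma"]
pairs = confusion_counts(true, pred)
print(per_class_errors(pairs, "Lestes"))  # (1, 1, 2)
```

False positives are read down the predicted column of the class and false negatives along its true row, which is why recurring off-diagonal cells reveal which genera are confused with one another.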
Unlike the present invesgaon, on which the shape
of the contour of the wings from 111 genera of drag-
onies was exclusively evaluated, and seven classiers
tested, Kuhn (2016) made a classicaon of 26 genera
of dragonies, in which characteriscs such as color,
texture and shape of the wings were included, reaching
a maximum accuracy of 91%, using only the Random
Forest algorithm classier. We suggest a possible ex-
planaon for the dierence found in accuracy between
Kuhn’s (2016) and our data is due to addional charac-
ters assessed for the dierenaon of species (texture,
coloraon and wing proporons). Our data suggests
that the contour used here by itself does not provide
enough informaon to obtain the accuracy found in
Kuhn (2016). Thus, we suggest that the combinaon of
the wing contours and the wing aributes previously
assessed by Kuhn (2016) (including a morphometric
analysis using 15 measurements, a chromac analysis
and, nally, the use of the Gabor wavelet transforma-
on on the images with dierent rotaons and scales)
might increase the accuracy of the automac idenca-
on for these taxa. In addion, we did nd that the LDA
classier has beer performance, suggesng the need
to assess other classiers than RF, which include all
the possible wing aributes to test their performance
in the classicaon. We expect to combine our results
with the previous wing aributes tested by Kuhn (2016)
for the automac idencaon to keep decreasing the
taxonomic impediment in the current biodiversity cri-
sis.
The largest number of misclassicaons of our data
are centered on the tested genera within the aniso-
pteran families Gomphidae and Libellulidae and the
zygo pteran Coenagrionidae and Lesdae families. This
is interesng as Gomphidae, Libellulidae and Coenagrio-
nidae are the most species rich families in the Odona-
ta. Our results suggest that most of the confusion and
classicaon errors are distributed among parcular
groups within families belonging to the same suborder
(Fig. 3). In parcular, there are two confusion groups
that belong to the Anisoptera suborder (Fig. 3). In the
rst group (Fig. 3A, red square), six genera of the Gom-
phidae family are included, while in the second group
(Fig. 3A, blue square), there are two genera that belong
to the Libellulidae family, two genera of the Aeshnid-
ae family and one of the Gomphidae family. Likewise,
Kuhn’s (2016) confusion matrix has similar classicaon
mistakes to the ones we observed here (Fig. 3). For ex-
ample, the genus Erythemis with the classier and ari-
butes tested by Kuhn (2016) was confused with species
of the genera Pachydiplax and Libellula; in our results it
was also confused with Libellula and a couple of aesh-
nids (Fig. 3A). For Zygoptera, our observed confusion
occurs mainly between the Lesdae and Coenagrioni-
dae families (Fig. 3B). The occurrence of greater confu-
sion within this suborder may be a consequence of the
low level of variaon in their shape between families.
This fact, in turn, underscores the need for idenca-
on of the Zygoptera facilitated by characteriscs such
as coloraon, types of joints of the veins in the wings,
paerns of venaon, presence of spines and distribu-
on of resilin patches (Appel & Gorb, 2014; Hassall,
2014).
Interesngly, our data suggest that these confusion
groups have similar wing contours, which can lead us to
look for possible hypotheses that explain these similari-
es among these taxa. Some explanaon can be due to
their ecology: for example, within the Anisoptera there
Sáenz Oviedo, Kuhn, ... & Sanchez-Herrera Are wing contours good classiers for automac idencaon in Odonata?
103Internaonal Journal of Odonatology │ Volume 25 │ pp. 96–106
Table 1. Genera found by each classier and number of individuals in the dataset. The rst column (“Genus”) has the names of
the 111 genera included in the dataset. The Xs mark where the class was found and the dark gray empty cells show the classes
(genera) that were absent in the classicaon report, for each of the classiers.
Genus Count LDA LR NB CART KNN SVM RFC
Acanthagrion 16 xxxxxxx
Aeshna 131 xxxxxxx
Amphiagrion 1x
Amphipteryx 1x
Anax 39 x x x x x x x
Anisagrion 2xxxxxxx
Aphylla 33 xxxxxxx
Archilestes 20 x x x x x x x
Argia 85 x x x x x x x
Arigomphus 81 xxxxxxx
Basiaeschna 27 xxxxxxx
Boyeria 11 xxxxxxx
Brachymesia 22 x x x x x x x
Brechmorhoga 4x x x x x x x
Calopteryx 149 xxxxxxx
Cannaphila 4x x
Castoraeschna 1x
Celithemis 149 x x x x x x x
Cordulegaster 90 xxxxxxx
Cordulia 8xxxxxxx
Coryphaeschna 25 xxxxxxx
Crocothemis 2x x
Cyanallagma 3x x x x x x x
Diastatops 13 xxxxxxx
Didymops 6xxxxxxx
Dorocordulia 17 x x x x x x x
Drepanoneura 1x x x x x x x
Dromogomphus 45 xxxxxxx
Dythemis 14 xxxxxxx
Enallagma 264 xxxxxxx
Epiaeschna 22 x x x x x x x
Epipleoneura 2x
Epitheca 130 xxxxxxx
Erpetogomphus 57 xxxxxxx
Erythemis 174 x x x x x x x
Erythrodiplax 107 x x x x x x x
Euthore 1xxxxxxx
Fluminagrion 1xxxxxxx
Gomphaeschna 17 xxxxxxx
Gomphurus 160 x x x x x x x
Gynacantha 29 x x x x x x x
Hagenius 17 xxxxxxx
Helocordulia 13 xxxxxxx
Hesperagrion 15 x x x x x x x
Hetaerina 75 x x x x x x x
Heteragrion 2
Hylogomphus 62 xxxxxxx
Idiataphe 1xxxxxxx
Iridictyon 3x x x x x x x
Ischnura 83 x x x x x x x
Ladona 55 xxxxxxx
Lanthus 21 xxxxxxx
Leptobasis 2x
Lestes 187 x x
Sáenz Oviedo, Kuhn, ... & Sanchez-Herrera Are wing contours good classiers for automac idencaon in Odonata?
104Internaonal Journal of Odonatology │ Volume 25 │ pp. 96–106
Genus Count LDA LR NB CART KNN SVM RFC
Leucorrhinia 70 xxxxxxx
Libellula 342 xxxxxxx
Macrodiplax 9x x x x x x x
Macromia 40 x x x x x x x
Macrothemis 43 xxxxxxx
Mecistogaster 1
Mesamphiagrion 4xxxxxxx
Miathyria 17 x x x x x x x
Micrathyria 84 x x x x x x x
Misagria 2xxxxxxx
Mnesarete 5xxxxxxx
Nannothemis 18 x x x x x x x
Nasiaeschna 14 x x x x x x x
Nehalennia 1x
Neoerythromma 1
Neoneura 1x
Nephepela 6x x x x x x x
Neurocordulia 7x x x x x x x
Octogomphus 9xxxxxxx
Oligoclada 1x
Ophiogomphus 88 x x x x x x x
Oplonaeschna 1
Orthemis 35 xxxxxxx
Pachydiplax 68 xxxxxxx
Palaemnema 3x
Paltothemis 14 x x x x x x x
Pantala 85 x x x x x x x
Perithemis 81 xxxxxxx
Phanogomphus 233 xxxxxxx
Phyllocycla 17 x x x x x x x
Phyllogomphoides 29 x x x x x x x
Plathemis 88 xxxxxxx
Polythore 220 xxxxxxx
Progomphus 62 xxxxxxx
Protoneura 4x x x x x x x
Pseudoleon 18 x x x x x x x
Remarnia 1xxxxxxx
Rhionaeschna 3
Rhodopygia 2x x x x x x x
Rimanella 1x x x x x x x
Somatochlora 55 x x x x x x x
Staurophlebia 1xxxxxxx
Stenocora 1x x
Stenogomphurus 7x x x x x x x
Stylogomphus 18 x x x x x x x
Stylurus 38 xxxxxxx
Sympetrum 146 xxxxxxx
Tachopteryx 10 x x x x x x x
Tanypteryx 3x x x x x x x
Tauriphila 21 x x x x x x x
Telebasis 6xxxxxxx
Tholymis 9xxxxxxx
Tramea 86 x x x x x x x
Triacanthagyna 13 x x x x x x x
Tuberculobasis 1
Uracis 9xxxxxxx
Zenithoptera 4xxxxxxx
Total Not found 11 19 19 12 17 19 19
Sáenz Oviedo, Kuhn, ... & Sanchez-Herrera Are wing contours good classiers for automac idencaon in Odonata?
105Internaonal Journal of Odonatology │ Volume 25 │ pp. 96–106
are an array of ight behaviors (iers, gliders and perch-
ers; Corbe & May, 2008) and these ight styles can
be reected in the similaries found in wing contours
within both our observed confusion groups. For exam-
ple, in migratory species of libellulids’ hindwings can
show convergence towards a wing planform that favors
the gliding ight as an energy saving strategy (Suarez-
Tovar & Sarmiento; 2016). For zygopterans, their ight
is more passive, and their ability to disperse might be
associated with slow ight or overight (Bomphrey et
al., 2016), which would explain any similaries in wing
contours for coenagrionids and lesds. Comparisons
of the damping raos and natural frequencies of two
dragony and two damsely species, shows that for the
anisopterans damping properes between fore- and
hindwings were signicantly dierent, while in zygo-
pterans there were no or very weak dierences in the
damping raos between both wings, suggesng that
the structural design and wing shape can inuence
the aerodynamics of their ight behaviors (Lietz et al;
2021). In addion, funconal morphology traits of the
wings, like types of joints of the wing veins, spines and
presence of resilin, a protein that gives certain exibility
to the wings of insects can be evaluated in this groups,
like previously done by Appel and Gorb (2014) to un-
derstand the wing contour similaries in these taxa.
Overall, our results suggest that wing contours by themselves can discriminate with moderate accuracy and precision, in comparison with other wing attributes obtained using high-resolution images. In addition, we tested multiple classifying algorithms for the contours, where LDA had the best performance.
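To make the reported measures concrete, the following minimal sketch shows how accuracy, precision, recall and F1 can be derived from a confusion matrix, and how the most frequent misclassifications (candidate confusion groups) can be listed. The function names, the three placeholder genus labels and the counts are hypothetical and are not data from this study. Macro averaging is assumed here, since the averaging scheme behind the reported values is not stated in this section.

```python
from typing import Dict, List, Tuple


def summarize_confusion(confusion: List[List[int]]) -> Dict[str, float]:
    """Accuracy plus macro-averaged precision, recall and F1.

    Rows are true classes, columns are predicted classes.
    Macro averaging is an assumption made for this sketch.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[k][k] for k in range(n))
    precisions, recalls, f1s = [], [], []
    for k in range(n):
        tp = confusion[k][k]
        predicted = sum(confusion[i][k] for i in range(n))  # column sum
        actual = sum(confusion[k])                          # row sum
        p = tp / predicted if predicted else 0.0
        r = tp / actual if actual else 0.0
        f1 = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f1)
    return {
        "accuracy": correct / total if total else 0.0,
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


def top_confusions(
    confusion: List[List[int]], labels: List[str], limit: int = 3
) -> List[Tuple[int, str, str]]:
    """Return the largest off-diagonal cells as (count, true, predicted)."""
    pairs = [
        (confusion[i][j], labels[i], labels[j])
        for i in range(len(labels))
        for j in range(len(labels))
        if i != j and confusion[i][j] > 0
    ]
    pairs.sort(reverse=True)
    return pairs[:limit]


# Hypothetical three-genus example (placeholder labels and counts).
labels = ["genus_A", "genus_B", "genus_C"]
confusion = [
    [8, 2, 0],
    [1, 7, 2],
    [0, 1, 9],
]
summary = summarize_confusion(confusion)
print(summary)
print(top_confusions(confusion, labels))
```

Reading the largest off-diagonal cells in this way is one simple route to spotting groups of genera that a classifier tends to confuse with one another.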
Acknowledgements
The authors would like to acknowledge the funding from NSF Grant #1564386: ODOMATIC: Automatic Species Identification, Functional Morphology, and Feature and NSF DBI WK Postdoctoral Grant #16116642: Leveraging face-detection methods to identify insects from field photos, automatically.
References
Anaconda Software Distribution. (2016). https://anaconda.com
Appel, E. & Gorb, S. N. (2014). Comparative functional morphology of vein joints in Odonata. Zoologica, 159.
Blidh, H. (2016). Python implementation of “Elliptic Fourier Features of a Closed Contour.” https://pyefd.readthedocs.io/en/latest/
Bomphrey, R. J., Nakata, T., Henningsson, P. & Lin, H. T. (2016). Flight of the dragonflies and damselflies. Philosophical Transactions of the Royal Society B: Biological Sciences, 371(1704). doi:10.1098/rstb.2015.0389
Brownlee, J. (2017). Master Machine Learning Algorithms (1.12). Machine Learning Mastery. https://machinelearningmastery.com/master-machine-learning-algorithms/
Ceballos, G., Ehrlich, P. R., Barnosky, A. D., García, A., Pringle, R. M. & Palmer, T. M. (2015). Accelerated modern human-induced species losses: Entering the sixth mass extinction. Science Advances, 1, 5. doi:10.1126/sciadv.1400253
Corbet, P. S. & May, M. L. (2008). Fliers and perchers among Odonata: dichotomy or multidimensional continuum? A provisional reappraisal. International Journal of Odonatology, 11(2), 155–171. doi:10.1080/13887890.2008.9748320
Córdoba-Aguilar, A. (2008). Dragonflies and Damselflies: Model Organisms for Ecological and Evolutionary Research. In Dragonflies and Damselflies: Model Organisms for Ecological and Evolutionary Research. doi:10.1093/acprof:oso/9780199230693.001.0001
Gandhi, R. (2018). Support Vector Machine – Introduction to Machine Learning Algorithms. Towards Data Science. https://towardsdatascience.com/support-vector-machine-introduction-to-machine-learning-algorithms-934a444fca47
Garrison, R. W., von Ellenrieder, N. & Louton, J. A. (2006). Dragonfly genera of the New World: an illustrated and annotated key to the Anisoptera. In Choice Reviews Online. Johns Hopkins University Press.
González, A. (2009). El conocimiento sistemático impedimento taxonómico la biodiversidad y. Revista de La Sociedad Española de Biología Evolutiva, 4(1), 19–32.
Harrington, P. (2012). Machine Learning in Action. Shelter Island: Manning Publications Co.
Hassall, C. (2014). Continental variation in wing pigmentation in Calopteryx damselflies is related to the presence of heterospecifics. PeerJ, 2014(1), e438. doi:10.7717/peerj.438
Heckman, C. W. (2006). Encyclopedia of South American Aquatic Insects: Odonata – Anisoptera. In Encyclopedia of South American Aquatic Insects: Odonata – Anisoptera. The Netherlands: Springer. doi:10.1007/978-1-4020-4802-5
Heckman, C. W. (2008). Encyclopedia of South American Aquatic Insects: Odonata – Zygoptera. In Encyclopedia of South American Aquatic Insects: Odonata – Zygoptera. The Netherlands: Springer. doi:10.1007/978-1-4020-8176-7
Kalkman, V. J., Clausnitzer, V., Dijkstra, K. D. B., Orr, A. G., Paulson, D. R. & van Tol, J. (2008). Global diversity of dragonflies (Odonata) in freshwater. Hydrobiologia, 595(1), 351–363. doi:10.1007/s10750-007-9029-x
Koh, L. P., Dunn, R. R., Sodhi, N. S., Colwell, R. K., Proctor, H. C. & Smith, V. S. (2004). Species Coextinctions and the Biodiversity Crisis. Science, 305(September), 1632–1635. doi:10.1126/science.1101101
Kuhn, W. R. (2016). Three approaches to automating taxonomy, with emphasis on the Odonata (dragonflies and damselflies). (Thesis). Rutgers, The State University of New Jersey.
la Salle, J., Williams, K. J. & Moritz, C. (2016). Biodiversity analysis in the digital era. Philosophical Transactions of the Royal Society B: Biological Sciences, 371(1702). doi:10.1098/rstb.2015.0337
Lietz, C., Schaber, C. F., Gorb, S. N. et al. (2021). The damping and structural properties of dragonfly and damselfly wings during dynamic movement. Communications Biology, 4, 737. doi:10.1038/s42003-021-02263-2
Lorenzo-Carballa, M. O. & Cordero Rivera, A. (2012). Odonatos. In P. Vargas & R. Zardoya (Eds.), El árbol de la Vida: sistemática y evolución de los seres vivos. pp. 293–301.
Moore, N. W. (1997). Dragonflies: Status Survey and Conservation Action Plan. Gland, Switzerland, and Cambridge, UK: IUCN.
Narayan, Y. (2021). Hb vsEMG signal classification with time domain and Frequency domain features using LDA and ANN classifier. Materials Today: Proceedings, 37, 3226–3230. doi:10.1016/j.matpr.2020.09.091
Pal, K. K. & Sudeep, K. S. (2016). Preprocessing for image classification by convolutional neural networks. 2016 IEEE International Conference on Recent Trends in Electronics, Information & Communication Technology (RTEICT), 1778–1781. doi:10.1109/RTEICT.2016.7808140
Paulson, D. R., Schorr, M. & Deliry, C. (2021). World Odonata List. University of Puget Sound. https://www2.pugetsound.edu/academics/academic-resources/slater-museum/biodiversity-resources/dragonflies/world-odonata-list2/
Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., ... Duchesnay, E. (2011). Scikit-learn: Machine Learning in Python. Journal of Machine Learning Research, 12(85), 2825–2830.
Raybaut, P. (2009). Spyder IDE (4.2.1). Pythonhosted. https://www.spyder-ide.org/
Samways, M. J. & Steytler, N. S. (1996). Dragonfly (Odonata) distribution patterns in urban and forest landscapes, and recommendations for riparian management. Biological Conservation. doi:10.1016/S0006-3207(96)00032-8
Shahriar, M. T. & Li, H. (2020). A Study of Image Pre-processing for Faster Object Recognition. ArXiv, October 2020. https://arxiv.org/abs/2011.06928
Sharma, P., Hans, P. & Gupta, S. C. (2020). Classification of plant leaf diseases using machine learning and image preprocessing techniques. Proceedings of the Confluence 2020 – 10th International Conference on Cloud Computing, Data Science and Engineering, 480–484. doi:10.1109/Confluence47617.2020.9057889
Suárez-Tovar, C. M. & Sarmiento, C. E. (2016). Beyond the wing planform: morphological differentiation between migratory and nonmigratory dragonfly species. Journal of Evolutionary Biology, 29, 690–703. doi:10.1111/jeb.12830
Tharwat, A., Gaber, T., Ibrahim, A. & Hassanien, A. E. (2017). Linear discriminant analysis: A detailed tutorial. AI Communications, 30(2), 169–190. doi:10.3233/AIC-170729
van Rossum, G. & Drake Jr, F. L. (2009). Python 3 Reference Manual. CreateSpace. https://dl.acm.org/doi/book/10.5555/1593511
Supplementary material
Supplementary Figure 1. Random Forest Classifier Confusion Matrix.
Supplementary Figure 2. Support Vector Machines Confusion Matrix.
Supplementary Figure 3. K-Nearest Neighbors Confusion Matrix.
Supplementary Figure 4. Classification and Regression Trees Confusion Matrix.
Supplementary Figure 5. Naïve Bayes Confusion Matrix.
Supplementary Figure 6. Logistic Regression Confusion Matrix.
Supplementary Table 1. Taxonomic information of specimens included in the analysis.
Supplementary Table 2. Training accuracy scores.
Supplementary Table 3. Summary of classification report: Number of classes found, accuracy, precision, recall, F1 score and support values of the classifiers tested.
Supplementary Table 4. ANOVA results for accuracy scores comparison.
Supplementary Table 5. Tukey multiple comparisons test for accuracy scores.
Supplementary Table 6. ANOVA results for F1 scores comparison.
Supplementary Table 7. Tukey multiple comparisons test for F1 scores.