
Biological shape characterization for automatic image recognition and diagnosis of protozoan parasites of the genus Eimeria


Abstract

We describe an approach of automatic feature extraction for shape characterization of seven distinct species of Eimeria, a protozoan parasite of domestic fowl. We used digital images of oocysts, a round-shaped stage presenting inter-specific variability. Three groups of features were used: curvature characterization, size and symmetry, and internal structure quantification. Species discrimination was performed with a Bayesian classifier using Gaussian distribution. A database comprising 3891 micrographs was constructed and samples of each species were employed for the training process. The classifier presented an overall correct classification of 85.75%. Finally, we implemented a real-time diagnostic tool through a web interface, providing a remote diagnosis front-end.
Pattern Recognition 40 (2007) 1899–1910
César A.B. Castañón a,b, Jane S. Fraga a, Sandra Fernandez a, Arthur Gruber a,∗,
Luciano da F. Costa b,∗∗
a Instituto de Ciências Biomédicas, Departamento de Parasitologia, Universidade de São Paulo, Av. Prof. Lineu Prestes 1374, São Paulo SP, 05508-000, Brazil
b Instituto de Física de São Carlos, Universidade de São Paulo, Caixa Postal 369, São Carlos SP, 13560-970, Brazil
Received 21 July 2006; received in revised form 21 November 2006; accepted 6 December 2006
© 2007 Pattern Recognition Society. Published by Elsevier Ltd. All rights reserved.
Keywords: Shape analysis; Feature extraction; Pattern classification; Image processing; Remote diagnosis; Real-time systems; Eimeria; Avian coccidiosis
1. Introduction
An important goal in image analysis is to classify and recog-
nize objects of interest in digital images. Objects can be charac-
terized in several ways, e.g. by identifying their colors, textures,
shapes, movements, and position within images. No universal
approach is currently available to solve pattern recognition
problems across different image domains. To construct
a model for object characterization and subsequent classification,
one first needs to analyze the image domain at hand.
Some applications of pattern recognition for biological prob-
lems, specifically for diagnosis purposes, have been reported
in the literature. Comaniciu et al. [1] developed an image retrieval
system to discriminate between malignant lymphomas
and chronic lymphocytic leukemia, using descriptors for textural
and shape characterization. A similar work, developed by
Sabino et al. [2] for leukemia diagnosis, was based on textural
identification through gray level co-occurrence matrices. Jalba
et al. [3] proposed another interesting approach for automatic
diatom identification, based on contour analysis by constructing
a morphological curvature scale space for feature extraction.
Other works adopted a Gaussian multivariate analysis for
the identification of bacterial types [4], recognition of culture
cells [5], and classification of chromosome images [6].
∗ Corresponding author. Tel.: +55 11 30917274; fax: +55 11 30917417.
∗∗ Also for correspondence.
A particularly interesting application field for implementing
image-based identification algorithms is parasite diagnosis.
Parasites have been classically discriminated and identified
through non-automated morphological analysis, among other
methods. Since many parasitic organisms present developmen-
tal stages that have a well-defined and reasonably homoge-
neous morphology, they are amenable to pattern recognition
techniques. Eimeria, a genus comprising pathogenic proto-
zoan parasites, has been used in several image analysis studies
[7–9]. A total of seven distinct Eimeria species can infect
the domestic fowl, causing an economically relevant disease
known as coccidiosis [10]. Because different species vary in
Fig. 1. Photomicrographs of oocysts of the seven Eimeria species of domestic fowl. Samples: (a) E. maxima, (b) E. brunetti, (c) E. tenella, (d) E. necatrix,
(e) E. praecox, (f) E. acervulina, and (g) E. mitis.
pathogenicity and virulence, their precise discrimination is im-
portant for epidemiological studies and disease control mea-
sures. Parasite oocysts, a round-shaped developmental stage,
are shed in profuse amounts in the feces of infected chicks.
Oocysts of distinct species present differences of size (area,
diameter), contour (elliptic, ovoid, circular), internal structure,
thickness and color of the oocyst wall, among other morpho-
logical variations (Fig. 1). However, the correct species dis-
crimination by human visual inspection is severely restricted by
the slight morphological differences that exist among the dis-
tinct species and the overlap of characteristics. Considering the
limitations imposed by morphology-based diagnosis, different
molecular approaches have been devised for species discrimi-
nation, such as a PCR-based diagnostic assay using the ribo-
somal ITS1 as a target [11,12]. Our group has also developed
molecular diagnostic tools for Eimeria spp., including a mul-
tiplex PCR assay for the simultaneous diagnosis of the seven
species that infect the domestic fowl [13]. These molecular
diagnosis assays are very sensitive and specific, but require
highly trained personnel and sophisticated infrastructure.
Previous works have reported the differentiation of Eime-
ria [7–9] and helminths [14] using digital image recognition.
Kucera and Reznicky [7] reported the species differentiation of
Eimeria spp. of domestic fowl using only two features, length
and width of oocysts, which were computed in a semiauto-
matic fashion. Such a limited number of characters, however,
restricted the ability to differentiate all seven species due to
the similar morphology and overlap among the distinct species.
Sommer [15,16], working with cattle Eimeria, used a more
complex approach, where the parametric contour was consid-
ered as input to compute the amplitude of the Fourier trans-
form. Nevertheless, the classification method (average linkage
clustering) does not consider the distribution of elements and is
not particularly suitable for real-time systems. Yang et al. [17]
developed an automatic system for human helminth egg detec-
tion and classification using artificial neural networks (ANNs).
The authors followed the work developed by Sommer [15],
where the parametric contour of the object was used to com-
pute the amplitude of the Fourier transform. Cross-validation
results showed correct classification rates of 86.1–90.3%, but
the small number of samples utilized severely restricted an es-
timation of the confidence level of the approach. Another work
using ANNs for object detection was described by Widmer
et al. [18] for the identification of Cryptosporidium parvum.
The authors differentiated parasite oocysts from sample debris
with success, but no species differentiation was conducted.
The small number of features utilized in these previous works
can be explained by the difficulty of quantifying morphologi-
cal features. This limitation, together with the high complexity
of the algorithms, makes the development of real-time systems
for automatic diagnosis a challenging task. In addition, the set
of features to be used is strongly dependent on the characteris-
tics of the image domain. In this regard, our group has reported
several techniques for shape characterization. Thus, Bruno
et al. [19] used multiscale features for the characterization
of cat ganglion neural cells, whereas Coelho et al. [20] pro-
posed another set of features (diameter, eccentricity, fractal
dimension, influence histogram, influence area, convex hull
area and convex hull diameter) for the same problem. Costa
et al. [21] used digital curvature as a feature for morphological
characterization and classification of landmark shapes.
In this paper we present an approach to extract morphological
information by using different computer vision techniques in
order to perform an automatic species differentiation of Eime-
ria spp. oocysts. We report the development of a shape repre-
sentation approach that considers three types of morphologi-
cal characteristics: (a) multiscale curvature, (b) geometry, and
(c) texture. All these features are automatically extracted, constituting
a 13D (13-dimensional) feature vector for each oocyst.
While the considered measurements and adopted classifica-
tion methods used throughout this work are not necessarily
novel, their combined application resulted in an operational
framework that was extensively validated for Eimeria classifi-
cation. To our knowledge, this is the most complete example of
a system that implements pattern recognition methods for par-
asite diagnosis and fully integrates them with a user friendly
web interface. We believe that this work may represent a new
paradigm for parasitological diagnosis.
The article is organized as follows. In Section 2, an overall
view of the system is presented. In Section 3 we discuss the
different techniques and the methodology for shape character-
ization, while the process and techniques for species classifi-
cation and image similarity are presented in Section 4. Section
5 reports the results of species differentiation and the develop-
ment of a real-time automatic species discrimination system.
Sections 6 and 7 present a discussion and conclusions of this
work, respectively.
2. An overview of the diagnosis system based on automatic
shape characterization
Fig. 1 presents oocyst photomicrographs of the seven Eime-
ria species of domestic fowl. As can be seen, the different
species vary in terms of size, shape, and internal structure.
In order to identify the species that corresponds to a specific
oocyst image, we initially developed a mathematical model to
characterize oocyst morphology, and then applied it for species
discrimination. The oocyst analysis and recognition process
reported here includes three components: (a) image pre-processing,
(b) feature extraction, and (c) pattern recognition.
The image pre-processing stage defines the boundary of the
object to be processed. This boundary is determined by the
parametric contour of the oocyst, which is a bi-dimensional
vector (x, y) representing the location of each pixel in the
contour. The feature extraction step uses the parametric contour
of the oocysts as an input vector. We used in total 13 features
describing curvature, geometrical measures and texture. These
characteristics constitute the feature vector of the oocyst image,
which in turn is stored in the feature database and provides the
input data for the pattern recognition stage.
The last analysis stage comprises the pattern recognition or
pattern classification. For this task, the classifier is submitted
to a training process using known observations, a training set,
and the necessary statistics. A class of patterns is typically rep-
resented as a probability density function (pdf) of features. In
this case, a simple model like a single Gaussian distribution is
used to represent each of the patterns. This Gaussian function
provides the basis for the multidimensional Bayesian classifier,
whose decision regions provide the basis for species identification.
3. Shape characterization
Images can be mathematically understood as sets of
connected points in a bidimensional space F, that can be
approximated in a discrete binary image space. Image classification,
performed directly on F, is a hard task that may require
\(O(N^2)\) comparisons, assuming that each image has N pixels.
Table 1
Geographic origin of the Eimeria strains and species used in this study and
the respective number of image samples

Name                 Origin                    Samples
E. acervulina H      Houghton, England         374
E. acervulina 103    São Paulo, Brazil         114
E. acervulina R7     Santa Catarina, Brazil    148
E. brunetti C        São Paulo, Brazil         418
E. maxima H          Houghton, England         103
E. maxima L          São Paulo, Brazil         91
E. maxima 50         São Paulo, Brazil         127
E. mitis RT          Czech Republic            335
E. mitis 30          São Paulo, Brazil         199
E. mitis 44          São Paulo, Brazil         223
E. necatrix DF       São Paulo, Brazil         259
E. necatrix 103      São Paulo, Brazil         145
E. praecox H         Houghton, England         377
E. praecox 1D1A      São Paulo, Brazil         180
E. praecox D         USA                       190
E. tenella H         Houghton, England         311
E. tenella CR        Czech Republic            137
E. tenella MC        São Paulo, Brazil         160
The representation of an image can be modified by applying
a suitable image transformation (IT) that maps F to a new,
and typically smaller, feature space. This means that most
of the classification-related information is “squeezed” in a rel-
atively small number of particularly informative features, lead-
ing to a reduction of the necessary feature space dimension.
The basic reasoning behind transform-based features is that
some chosen “sets of filters” [22] can exploit and remove in-
formation redundancies that normally occur in natural images
[23]. Shape can be represented either by its contour or by its
region [24,25]. Global contour-based descriptors can be com-
puted from the shape boundary. In the present work, three sets
of features are used: (a) shape analysis tools based on the multiscale
Fourier transform-based approach to curvature estimation,
(b) geometrical measurements, and (c) features for texture
characterization.
3.1. Biological samples
Parasite samples of each one of the seven Eimeria species
that infect the domestic fowl were used throughout this work. In
addition, whenever available, we used multiple strains of each
species, collected from different geographic sources (Table 1).
The parasites were propagated in three-week old chicks and
oocysts were isolated following standard protocols [10].
3.2. Image pre-processing
We used oocyst micrographs as the starting point for the au-
tomatic analysis. The pictures were obtained with an optical mi-
croscope (Nikon Eclipse E800) coupled to a 4-megapixel CCD
camera (Nikon Coolpix 4500). The images were captured with
a 40× magnification objective and saved as 24-bit JPEG (fine
quality option) files. Using these conditions, all pictures presented
a spatial resolution of 11.1 pixels/µm. Depending on the
Fig. 2. Different stages of oocyst image pre-processing. An original color image is firstly converted into (a) a gray scale image. After segmentation, the
resulting (b) binarized image is used for (c) contour detection.
sample concentration and purity, a single micrograph could pro-
vide many oocysts for further processing. Low quality oocyst
images were not considered for downstream analysis. Common
practical problems included out of focus images, oocysts not
adequately well positioned, and atypical oocyst morphologies
caused by accidental cracking or squeezing. Another problem-
atic aspect we observed is related to the presence of debris
and bacteria in dirty samples, thus complicating object seg-
mentation. The process of oocyst image isolation was carried
out manually by using an image processing program (Gimp
or Adobe Photoshop). The objects of interest, single oocysts,
were cropped out of the picture and used to create new im-
age files that were in turn used as input data to our system.
A total of 3891 oocyst images constituted the data set of the
present work.
Image quality can be substantially heterogeneous due to differences
in illumination, contrast, focus, and acquisition resolution,
thus hampering object detection. To reduce the effect of
illumination variations, we equalized the images through the
histogram specification method [26], considering as eigenimage a
prototype computed previously for each species from the train-
ing set. For object segmentation, we applied a thresholding
approach [26] with a cut-off value manually determined for
each image. As a result, binary images were produced with the
respective object being defined by black pixels on a background
of white pixels. The steps of converting an original color oocyst
image into a parametric contour are depicted in Fig. 2.
The binarized images (see Fig. 2b) are submitted to an al-
gorithm that extracts the external contour of the object. This is
done by selecting an initial point belonging to the contour of
the object. The algorithm involves successive detections of the
next contour pixel by using chain-code directions. The result is
a parametric representation, where every point in the contour
is identified by coordinates x(t) and y(t) [25].
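The thresholding and contour-following steps described above can be illustrated with a minimal sketch. It assumes a gray-scale image held in a NumPy array, a single dark object on a light background, and a Moore-neighbour tracer standing in for the chain-code algorithm (the original implementation is not given in the text). The function names `binarize` and `trace_contour` are hypothetical, and the simple stopping rule (return to the start pixel) is only adequate for compact, roughly convex shapes such as oocysts.

```python
import numpy as np

# 8-neighbour offsets (row, col), clockwise on screen, starting from "west".
NEIGHBOURS = [(0, -1), (-1, -1), (-1, 0), (-1, 1),
              (0, 1), (1, 1), (1, 0), (1, -1)]

def binarize(gray, cutoff):
    """Object pixels (darker than the manually chosen cut-off) become True."""
    return np.asarray(gray) < cutoff

def trace_contour(mask):
    """Follow the outer boundary of a single object; returns (x, y) points."""
    padded = np.pad(np.asarray(mask, dtype=bool), 1, constant_values=False)
    rows, cols = np.nonzero(padded)
    if rows.size == 0:
        return np.empty((0, 2), dtype=int)
    start = (int(rows[0]), int(cols[0]))   # top-most, then left-most object pixel
    contour = [start]
    current, back = start, 0               # 0 = west neighbour, known background
    while True:
        for k in range(1, 9):
            d = (back + k) % 8
            nxt = (current[0] + NEIGHBOURS[d][0], current[1] + NEIGHBOURS[d][1])
            if padded[nxt]:
                prev = NEIGHBOURS[(d - 1) % 8]
                # Direction from the new pixel to the last background pixel examined.
                back = NEIGHBOURS.index((current[0] + prev[0] - nxt[0],
                                         current[1] + prev[1] - nxt[1]))
                current = nxt
                break
        else:
            break                          # isolated pixel: no object neighbour
        if current == start:
            break                          # contour closed
        contour.append(current)
    points = np.array(contour) - 1         # undo the one-pixel padding
    return points[:, ::-1]                 # (row, col) -> (x, y)
```

The two columns of the returned array play the role of the signals x(t) and y(t), with t being the position along the contour.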
3.3. Curvature based on multiscale Fourier transformation
The curvature of an object is an important characteristic
that can be extracted from the respective contour. The pioneer
work of Attneave [27] emphasized the importance that tran-
sient events and asymmetries have in human visual perception,
thus influencing the subsequent research on shape in computer
vision. Riggs [28], for instance, postulated that curvature detec-
tors would be present at the neuronal level in humans. Due to
its biological motivation, curvature analysis has gained atten-
tion from the pattern recognition community, and many meth-
ods have been proposed to compute it [29].
Our approach takes advantage of the closed parametric con-
tour that is represented by the x(t) and y(t) signals, which are
used for curvature estimation using the Fourier derivative property
[30]. Let the parametric representation of the contour be
\[ c(t) = (x(t), y(t)); \tag{1} \]
the curvature k(t) of c(t) is defined as
\[ k(t) = \frac{\dot{x}(t)\,\ddot{y}(t) - \ddot{x}(t)\,\dot{y}(t)}{\left(\dot{x}(t)^2 + \dot{y}(t)^2\right)^{3/2}}, \tag{2} \]
where \(\dot{x}\) and \(\dot{y}\) are the first derivatives, and \(\ddot{x}\) and \(\ddot{y}\) are the second
derivatives, of the signals x(t) and y(t), respectively. Those values
can be easily computed using the Fourier derivative property [25].
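As an illustration of the curvature formula combined with the Fourier derivative property, the following sketch differentiates the contour signals in the frequency domain. It assumes a closed contour sampled uniformly over one period with the parameter t in [0, 1); the function names are hypothetical and not taken from the original system.

```python
import numpy as np

def fourier_derivative(signal, order=1):
    """Derivative of a periodic signal: multiply its spectrum by (j 2 pi f)^order."""
    signal = np.asarray(signal, dtype=float)
    n = signal.size
    freqs = np.fft.fftfreq(n, d=1.0 / n)   # integer frequencies for t in [0, 1)
    spectrum = np.fft.fft(signal)
    return np.fft.ifft((2j * np.pi * freqs) ** order * spectrum).real

def curvature(x, y):
    """k(t) = (x' y'' - x'' y') / (x'^2 + y'^2)^(3/2) for a closed contour."""
    dx, dy = fourier_derivative(x, 1), fourier_derivative(y, 1)
    ddx, ddy = fourier_derivative(x, 2), fourier_derivative(y, 2)
    return (dx * ddy - ddx * dy) / (dx ** 2 + dy ** 2) ** 1.5
```

For a counter-clockwise circle of radius 20, this returns a constant curvature of 1/20, as expected.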
Using an arc length parameterization, and convolving the
original contour signal c(t) with derivatives of a Gaussian function
with varying standard deviation a, the multiscale curvature
follows from Eq. (2), as described by Mokhtarian
et al. [29]:
\[ k(t, a) = \dot{x}(t, a)\,\ddot{y}(t, a) - \ddot{x}(t, a)\,\dot{y}(t, a). \tag{3} \]
The multiscale approach to curvature estimation leads to
the so-called curvegram, where the curvature values appear as
a scale-space representation. Fig. 3 shows the contour of an
oocyst (panel a) and its corresponding curvegram (panel b).
Gaussian smoothing is essential for controlling curvature instabilities
caused by noise along the contour c(t), which would otherwise
produce many peaks of variable height. The smoothing
level is determined by the standard deviation a of the Gaussian
function. A small a value (Fig. 3b, a = 10) results in a noisy
curvature, whereas a higher value yields a smoother curvature
(Fig. 3c, a = 50). This effect can be better observed in a 3D
curvegram that includes different scale values (Fig. 3d).
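A curvegram of this kind can be sketched by applying the Gaussian smoothing directly in the frequency domain, one scale at a time. The sketch below makes several assumptions: the scale a is expressed in contour samples (the units used for Fig. 3 are not stated), the contour is uniformly sampled, and the normalized curvature of Eq. (2) is used at every scale, which reduces to the numerator alone when the smoothed contour has unit speed. The function name `curvegram` is hypothetical.

```python
import numpy as np

def curvegram(x, y, scales):
    """Return one curvature row per scale a (Gaussian std, in contour samples)."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    f = np.fft.fftfreq(x.size)                 # frequencies in cycles per sample
    X, Y = np.fft.fft(x), np.fft.fft(y)
    d1 = 2j * np.pi * f                        # first-derivative multiplier
    d2 = d1 ** 2                               # second-derivative multiplier
    rows = []
    for a in scales:
        # Fourier transform of a unit-area Gaussian with standard deviation a.
        g = np.exp(-2.0 * (np.pi * f * a) ** 2)
        dx, dy = np.fft.ifft(d1 * g * X).real, np.fft.ifft(d1 * g * Y).real
        ddx, ddy = np.fft.ifft(d2 * g * X).real, np.fft.ifft(d2 * g * Y).real
        rows.append((dx * ddy - ddx * dy) / (dx ** 2 + dy ** 2) ** 1.5)
    return np.array(rows)
```

On a noisy circle, the row computed with the larger scale fluctuates far less than the row computed with the smaller one, which is the behaviour described for Fig. 3b and c.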
While the curvature itself can be used as a feature vector, this
approach presents some serious drawbacks, including the fact
that the curvature signal can be too large (involving thousands
Fig. 3. An oocyst contour (a) and the corresponding curvegrams using gamma values of 10 (b) and 50 (c), or a range of standard deviation values for Gaussian
function, displayed in a 3D curvegram (d).
of points, depending on the contour) and highly redundant.
Once the curvature has been estimated, the following shape
measures [25] can be calculated in order to circumvent these
problems: sampled curvature, curvature statistics (mean, median,
variance, standard deviation, entropy, higher moments,
etc.), maxima, minima, inflection points, and bending energy.
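A few of these compact descriptors can be computed from a sampled curvature signal as sketched below. The histogram-based entropy (with a hypothetical bin count of 32) and the bending energy taken as the mean squared curvature are common definitions assumed here for illustration; the text does not state the exact formulas that were used.

```python
import numpy as np

def curvature_descriptors(k, bins=32):
    """Summarize a curvature signal k(t) with a handful of scalar features."""
    k = np.asarray(k, dtype=float)
    counts, _ = np.histogram(k, bins=bins)
    p = counts[counts > 0] / k.size            # empirical bin probabilities
    return {
        "mean": float(k.mean()),
        "std": float(k.std()),
        "entropy": float(-(p * np.log2(p)).sum()),   # Shannon entropy, in bits
        "bending_energy": float(np.mean(k ** 2)),
    }
```

A perfectly circular contour has constant curvature, so its standard deviation and entropy are both zero under these definitions.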
3.4. Geometrical measurements
Some oocyst species present distinctive characteristics
based only on shape and size, making it necessary to find
additional features to characterize them. For instance, principal
component analysis [25] was applied in order to find the main
directional vectors (eigenvectors), and used to define some
measurements such as diameters and symmetry. We used
bilateral symmetry, which is considered a primary case of the
geometric concept of symmetry [31]. Considering a binary
image, the shape is reflected with respect to the orientation
defined by its major axis to obtain a degree of bilateral
symmetry; the same process is then applied with respect to the minor
axis [25]. Some additional measurements related to symmetry
have also been described in the literature [32–36].
In the present work, we considered the diameters (major and
minor axis) and symmetry of the oocysts. Simple global descriptors
included area (number of pixels in the region), eccentricity
(length of major axis/length of minor axis), circularity
(perimeter²/area), and bending energy [37].
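The PCA-based measurements can be sketched as follows for a binary mask. This is an illustrative reading of the description, not the original code: the diameters are taken as the extent of the pixel coordinates along each principal direction, and the symmetry degree is assumed to be the fraction of object pixels whose mirror image across an axis is also an object pixel. All function names are hypothetical, and the rounding used for the reflection makes the symmetry value only approximate for arbitrarily rotated shapes.

```python
import numpy as np

def principal_coordinates(mask):
    """Object pixel coordinates expressed in the (major, minor) eigenvector basis."""
    ys, xs = np.nonzero(mask)
    pts = np.column_stack([xs, ys]).astype(float)
    centred = pts - pts.mean(axis=0)
    _, vecs = np.linalg.eigh(np.cov(centred, rowvar=False))
    return centred @ vecs[:, ::-1]             # eigh sorts ascending: reverse it

def extent(values):
    """Diameter in pixels along one principal direction."""
    return float(values.max() - values.min() + 1.0)

def symmetry_degree(proj, axis):
    """Fraction of pixels whose reflection across the given axis is in the shape."""
    occupied = {tuple(p) for p in np.rint(proj).astype(int)}
    mirrored = proj.copy()
    mirrored[:, 1 - axis] *= -1.0              # reflect: flip the other coordinate
    hits = sum(tuple(p) in occupied for p in np.rint(mirrored).astype(int))
    return hits / len(proj)

def geometric_features(mask):
    proj = principal_coordinates(mask)
    major, minor = extent(proj[:, 0]), extent(proj[:, 1])
    return {
        "area": int(np.count_nonzero(mask)),
        "major_axis": major,
        "minor_axis": minor,
        "eccentricity": major / minor,
        "symmetry_major": symmetry_degree(proj, 0),
        "symmetry_minor": symmetry_degree(proj, 1),
    }
```

At the stated resolution of 11.1 pixels/µm, dividing the two diameters by 11.1 would convert them to micrometres.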
3.5. Texture characterization based on co-occurrence matrices
The several methods for texture analysis have been classified
by Tuceryan and Jain [38] into four categories: statistical, geo-
metrical, model based and signal processing based. A powerful,
frequently used method, involves the so-called co-occurrence
matrices [39]. This method provides a second-order approach
for generating texture features. Although mainly applied to tex-
ture discrimination of images, co-occurrence matrices have also
been used for region segmentation [40].
The co-occurrence matrices take into account information
about the relative positions of the various gray levels within
the image. There are two parameters used for computing
co-occurrence matrices: (a) the relative distance among the
pixels and (b) their relative orientation. They involve the conditional
joint probabilities, \(C_{ij}\), of all pairwise combinations of
gray levels given the inter-pixel displacement vector \((\Delta x, \Delta y)\),
which represents the separation of the pixel pairs in the x- and
y-directions, respectively. Traditionally, the probabilities are
stored in a gray level co-occurrence matrix (GLCM) [39,41].
This resultant matrix is a second-order histogram from which
some information can be extracted [42]: angular second mo-
ment, contrast, inverse difference moment and entropy.
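A minimal sketch of a GLCM and of the four texture measures named above is given below. It assumes an image already quantized to integer gray levels 0..levels-1, a non-symmetric matrix built for a single displacement (dx, dy), and the usual Haralick-style formulas with a base-2 logarithm for the entropy; the quantization and displacement actually used for the oocyst images are not stated in the text, and the function names are hypothetical.

```python
import numpy as np

def glcm(image, dx, dy, levels):
    """Normalized co-occurrence matrix of pixel pairs (r, c) and (r + dy, c + dx)."""
    img = np.asarray(image)
    h, w = img.shape
    r0, r1 = max(0, -dy), min(h, h - dy)       # rows where both pixels exist
    c0, c1 = max(0, -dx), min(w, w - dx)       # columns where both pixels exist
    src = img[r0:r1, c0:c1]
    dst = img[r0 + dy:r1 + dy, c0 + dx:c1 + dx]
    counts = np.zeros((levels, levels), dtype=float)
    np.add.at(counts, (src.ravel(), dst.ravel()), 1.0)
    return counts / counts.sum()

def texture_features(P):
    """Second-order statistics extracted from a normalized GLCM."""
    i, j = np.indices(P.shape)
    nonzero = P[P > 0]
    return {
        "angular_second_moment": float(np.sum(P ** 2)),
        "contrast": float(np.sum((i - j) ** 2 * P)),
        "inverse_difference_moment": float(np.sum(P / (1.0 + (i - j) ** 2))),
        "entropy": float(-np.sum(nonzero * np.log2(nonzero))),
    }
```

In practice, several displacements (for example, one pixel at 0°, 45°, 90° and 135°) are often averaged to reduce the dependence on orientation.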
4. Pattern classification
4.1. Bayesian classifier
Classification is always performed with respect to some prop-
erties (or features) of the objects. Indeed, the fact that objects
share the same property defines an equivalence relation in terms
of partitioning of the object space. In this sense, a sensible clas-
sification operates in such a way as to group together into the
same “class” entities that share some properties, while distinct
classes are assigned to entities with distinct properties. We use
the term “classifier” for each statistical tool, trained using a
specific data set, to discriminate distinct “classes”.
A Bayesian classifier [43] utilizes a probabilistic approach
for classification. It can be used to compute the probability
that an example x belongs to class \(\omega_i\). The computer implementation
is facilitated using the multivariate normal density
function, which is entirely defined by two parameters: the
mean \(\mu_i\) and the covariance matrix \(\Sigma_i\). Although the Bayesian
decision rule is not a discriminant function, it defines regions
that can be expressed in terms of discriminant functions \(g_i(x)\).
To classify a new element x into one of the classes \(\omega_i\), we take
the highest value of the \(g_i\) functions as the corresponding true
class.
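A Gaussian Bayesian classifier of this kind can be sketched as follows. The sketch assumes the standard log-discriminant for normal densities, g_i(x) = -1/2 (x - mu_i)^t Sigma_i^-1 (x - mu_i) - 1/2 ln|Sigma_i| + ln P(class i), with class priors estimated from the training frequencies; the text does not give the exact form used, and the function names are hypothetical.

```python
import numpy as np

def fit_gaussians(X, y):
    """Estimate mean, covariance and prior for each class label in y."""
    X, y = np.asarray(X, dtype=float), np.asarray(y)
    model = {}
    for label in np.unique(y):
        Xc = X[y == label]
        cov = np.atleast_2d(np.cov(Xc, rowvar=False))
        model[label] = (Xc.mean(axis=0), cov, len(Xc) / len(X))
    return model

def discriminants(model, x):
    """Evaluate the log-discriminant g_i(x) of every class."""
    x = np.asarray(x, dtype=float)
    scores = {}
    for label, (mu, cov, prior) in model.items():
        diff = x - mu
        _, logdet = np.linalg.slogdet(cov)
        maha = diff @ np.linalg.solve(cov, diff)   # squared Mahalanobis distance
        scores[label] = -0.5 * maha - 0.5 * logdet + np.log(prior)
    return scores

def classify(model, x):
    """Assign x to the class with the highest discriminant value."""
    scores = discriminants(model, x)
    return max(scores, key=scores.get)
```

With 13 features per oocyst, each class needs enough training samples for its 13 × 13 covariance matrix to be well conditioned, which is one reason the training-set size matters.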
4.2. Algorithm for the partition process and classifier
Aiming at obtaining a robust and reliable classifier, we devel-
oped an algorithm (Algorithm 1) to select the best combination
of features, evaluate the most adequate size of the training set,
and evaluate the classification accuracy.
For each class, the corresponding data set was randomly di-
vided into two groups, the training and the test sets. Different
proportions of these sets were tested using intervals defined by
integers (e.g. from 10:90 to 90:10). In addition, for each train-
ing:test proportion we generated a user-defined number of ran-
domly selected paired sets, which were evaluated independently
to reduce possible sampling biases. Each set was then evaluated
with respect to its ability to correctly classify. The average of
the classification scores, obtained for each of these paired sets,
was considered as the final score of correct classification for that
particular proportion of training:test sets. This approach was
recursively applied to the different training:test set percentages.
Finally, the classification matrix was calculated as the average
of all confusion matrices resulting from each training:test partition.
Algorithm 1.
Require: Nc # of classes;
Require: Nf # of features;
Require: %training % of training set;
Require: %test % of test set;
Require: NrandomPartitions # of random sets;
Require: LC # of learning cycles;
1: set MclassAux[ ][ ] with zeros;
2: for i = 1 to NrandomPartitions do
3:   [TrainingSet, TestSet] = PARTITION(DataSet, %training, %test, Nc);
4:   Mclass = BAYESIANCLASSIFIER(TrainingSet, TestSet, Nc, Nf, LC);
5:   MclassAux = MclassAux + Mclass;
6: end for
7: MclassMean = MclassAux / NrandomPartitions;
8: return MclassMean;
The procedure requires a DataSet with a defined number of
classes (Nc) and a number of features (Nf). The partition is
defined by the %training :%test proportion, and the number
of times that the random process of partition will occur is
determined by the NrandomPartitions parameter. Additionally,
an LC parameter defines the number of learning cycles of the
classifier. The resultant matrix is MclassMean.
For a better understanding of the algorithm, the partition pro-
cess and the classifier are represented as separate implemen-
tations. The PARTITION function is responsible for the random
process of partition of the DataSet, using the following pa-
rameters as input: the data set, the training:test proportion, and
the number of classes. The function thus returns the respec-
tive training and test sets. The BAYESIANCLASSIFIER function
is the core process that implements the classifier. The classifier
is trained with the TrainingSet and evaluated with the TestSet.
Both tasks also require as input the number of classes, features,
and learning cycles. The function then returns a classification
confusion matrix Mclass. Finally, MclassMean is the resultant
confusion matrix, calculated as the average of all Mclass confusion
matrices, computed for each of the distinct random partitions.
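The repeated random-partition procedure of Algorithm 1 can be sketched in Python as below. The sketch assumes a per-class (stratified) random split, one Gaussian per class with priors taken from the training frequencies, and a confusion matrix expressed in row percentages. The LC (learning cycles) parameter is omitted because its role is not detailed in the text. The Python function names are hypothetical counterparts of PARTITION and BAYESIANCLASSIFIER.

```python
import numpy as np

def partition(y, train_fraction, rng):
    """Random split drawing the same training fraction from every class."""
    train_idx = []
    for c in np.unique(y):
        idx = rng.permutation(np.flatnonzero(y == c))
        # Keep at least two samples so that a covariance can be estimated.
        train_idx.extend(idx[: max(2, round(train_fraction * idx.size))])
    train = np.zeros(y.size, dtype=bool)
    train[train_idx] = True
    return train, ~train

def bayesian_classifier(X_tr, y_tr, X_te, y_te, classes):
    """Train on one set, test on the other; return a confusion matrix in %."""
    scores = []
    for c in classes:
        Xc = X_tr[y_tr == c]
        mu, cov = Xc.mean(axis=0), np.atleast_2d(np.cov(Xc, rowvar=False))
        diff = X_te - mu
        maha = np.sum(diff * np.linalg.solve(cov, diff.T).T, axis=1)
        scores.append(-0.5 * maha - 0.5 * np.linalg.slogdet(cov)[1]
                      + np.log(len(Xc) / len(X_tr)))
    pred = np.argmax(np.array(scores), axis=0)     # index of the winning class
    M = np.zeros((len(classes), len(classes)))
    for i, c in enumerate(classes):
        for j in range(len(classes)):
            M[i, j] = 100.0 * np.mean(pred[y_te == c] == j)
    return M

def mean_confusion_matrix(X, y, train_fraction, n_random_partitions, seed=0):
    """Average the confusion matrices obtained over many random partitions."""
    rng = np.random.default_rng(seed)
    classes = np.unique(y)
    acc = np.zeros((classes.size, classes.size))
    for _ in range(n_random_partitions):
        tr, te = partition(y, train_fraction, rng)
        acc += bayesian_classifier(X[tr], y[tr], X[te], y[te], classes)
    return acc / n_random_partitions
```

The overall rate of correct classification then corresponds to the mean of the diagonal, for example `np.trace(M) / M.shape[0]`.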
4.3. Image similarity
Following class assignment of the vector x through a
Bayesian classifier, the next step is to know the level of similarity
between the query image and the assigned species. In
this sense, the prototype element of the class is the mean
of the normal density. Considering a training set composed
of samples \(x_1, \ldots, x_n\), the prototype of this set is the average
of the samples. Thus, we adopted this prototype as the most
representative element for each class.
The Mahalanobis distance is used as a similarity metric
between the element x classified in class \(\omega_i\) and its
prototype \(\mu_i\). This distance is adequate for multivariate normal
data, which tend to cluster around the mean vector \(\mu\), falling
in an ellipsoidally shaped cloud whose principal axes are the
eigenvectors of the covariance matrix \(\Sigma\). Thus, the natural
measure of the distance from x to the mean \(\mu\) is provided by
the quantity
\[ r^2 = (x - \mu)^{t}\,\Sigma^{-1}\,(x - \mu). \tag{4} \]
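The squared Mahalanobis distance between a feature vector and a class prototype can be computed as in the sketch below, where the prototype and covariance matrix are estimated from the training samples of the class. The function names are hypothetical.

```python
import numpy as np

def class_prototype(samples):
    """Prototype (mean vector) and covariance matrix of one class."""
    samples = np.asarray(samples, dtype=float)
    return samples.mean(axis=0), np.atleast_2d(np.cov(samples, rowvar=False))

def mahalanobis_sq(x, mean, cov):
    """r^2 = (x - mean)^t cov^-1 (x - mean), solved without an explicit inverse."""
    diff = np.asarray(x, dtype=float) - np.asarray(mean, dtype=float)
    return float(diff @ np.linalg.solve(cov, diff))
```

With a diagonal covariance of (4, 1), a displacement of 2 along the first axis gives r² = 1, whereas the same displacement along the second axis gives r² = 4, which shows how the metric accounts for the spread of each feature.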
5. Results
5.1. Feature space and selection
The feature space is defined by features divided into three
groups: curvature, geometry, and texture. In the present work,
the feature vectors are 13D. Table 2 displays the 13 morpho-
logical features utilized for shape characterization.
Feature selection is an NP-hard problem [44]. A possible
method that guarantees an optimal solution is the exhaus-
tive search, where all combinations of features subsets are
tested. For each combination, we used the separability criteria
(Bayesian classifier, described in Section 4.1) and selected
the best feature vector combination [42]. These methods, also
known as sequential methods, are the mainstream approach
for performing feature selection and are guaranteed to find the
optimal subset [45].
Using a sequential forward selection (SFS) [46] test for each
number and combination of features, we generated subsets that
were subsequently processed by the classification process (Al-
gorithm 1). To determine the overall rate of correct classifica-
tion, each subset was then divided randomly into training (30%)
and test (70%) sets. This process was repeated 100 times for
each feature combination.
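The SFS procedure can be sketched as a greedy loop that adds, at each step, the single feature that most improves a scoring function. In the sketch below, `score` is a hypothetical callable that takes a list of feature indices and returns a figure of merit; in the setting described here it would be the mean rate of correct classification over 100 random 30:70 partitions.

```python
def sequential_forward_selection(score, n_features, k):
    """Greedily grow a feature subset up to size k; return the selection history."""
    selected, history = [], []
    remaining = list(range(n_features))
    while remaining and len(selected) < k:
        # Try each remaining feature together with those already selected.
        best = max(remaining, key=lambda f: score(selected + [f]))
        selected.append(best)
        remaining.remove(best)
        history.append((tuple(selected), score(selected)))
    return history
```

Each entry of the returned history corresponds to one row of a table such as Table 3: the best subset found for a given number of features, together with its score. Note that the feature IDs in Table 2 start at 1, whereas the indices here start at 0.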
The best combination of features, yielding the highest value
of correct classification, was determined and selected for each
number of utilized features, varying from two to 13 combined
features. Table 3 shows the results obtained for the different
Table 2
Feature space for morphological characteristics of Eimeria spp. of domestic
fowl

Type         ID    Feature name
Curvature    1     Mean of curvature
             2     Standard deviation of curvature
             3     Entropy of curvature
Geometry     4     Major axis
             5     Minor axis
             6     Symmetry through major axis
             7     Symmetry through minor axis
             8     Area
             9     Entropy of oocyst content
Texture      10    Angular second moment
             11    Contrast
             12    Inverse difference moment
             13    Entropy

The 13D space is divided into three types of features: curvature, geometry,
and texture.
numbers of utilized features. Thus, the best combination of two
features (4 and 5) yielded a correct classification of 77.25%.
The highest correct classification value (85.90%) overall was
obtained with a combination of 12 features. Since the correct
classification rates observed by using 10–13 features varied
within the range of one standard deviation (data not shown), we
decided to employ the 13 features in all subsequent analyses.
5.2. Evaluation of the size of the training set
In order to estimate the minimum number of samples re-
quired for the training set, still able to yield an acceptable rate
of correct classification, we conducted a series of experiments.
Considering that the number of samples for each species was
not the same, we randomly extracted 320 elements from each
class. In this regard, a total of 2240 oocyst samples of the seven
Eimeria species were used. For each species, the corresponding
data set was randomly divided into two groups, the training set
and the test set, in relative proportions varying from 95%:5%
to 5%:95%, respectively, using intervals defined by integers. In
addition, for each proportion, the number of random partitions
was 100. The average of the diagonal of the resultant confusion
matrix, obtained for each of these 100 paired sets, was consid-
ered as the final score of correct classification for that particular
proportion of training:test sets. This approach was repeated for
the different training:test set percentages using Algorithm 1.
As can be seen in Fig. 4, there is a clear correlation between the
size of the training set and the overall accuracy of the
classification. For a data set size of 2240 images, a good compromise
between training set size and accuracy was attained with circa 30% of
the images. Considering that the data set consists of 2240 samples
from the seven distinct Eimeria species (320 per species), we
conclude that a minimum acceptable size for the training set would be
96 images for each species, comprising a total of 672 samples (30% of
the data set). In fact, using distinct smaller data sets, we
confirmed that this absolute number of oocyst images per species was
adequate for training purposes (data not shown).
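The evaluation protocol described in this section (repeated random
training:test partitions, scored by the mean of the confusion matrix
diagonal) can be sketched as below. This is an independent
illustration and not a transcription of Algorithm 1: the function
names are hypothetical, class labels are assumed to be integers, the
covariance ridge is an added safeguard, and the data are synthetic
stand-ins for the oocyst feature vectors.

```python
import numpy as np


def fit_gaussian_bayes(X, y):
    """Estimate mean, covariance and prior of one Gaussian per class."""
    params = {}
    for c in np.unique(y):
        Xc = X[y == c]
        # A small ridge keeps the covariance invertible for small sets.
        cov = np.cov(Xc, rowvar=False) + 1e-6 * np.eye(X.shape[1])
        params[int(c)] = (Xc.mean(axis=0), cov, len(Xc) / len(X))
    return params


def predict(params, X):
    """Assign each row of X to the class with the highest log-posterior."""
    classes = sorted(params)
    log_post = []
    for c in classes:
        mean, cov, prior = params[c]
        diff = X - mean
        maha = np.einsum("ij,jk,ik->i", diff, np.linalg.inv(cov), diff)
        _, logdet = np.linalg.slogdet(cov)
        log_post.append(np.log(prior) - 0.5 * (logdet + maha))
    return np.array(classes)[np.argmax(log_post, axis=0)]


def mean_diagonal_accuracy(X, y, train_fraction, n_partitions=100, seed=0):
    """Average, over random partitions, of the confusion-matrix diagonal."""
    rng = np.random.default_rng(seed)
    classes = np.unique(y)
    scores = []
    for _ in range(n_partitions):
        train = np.zeros(len(y), dtype=bool)
        for c in classes:  # split every class in the same proportion
            idx = rng.permutation(np.flatnonzero(y == c))
            train[idx[: int(round(train_fraction * len(idx)))]] = True
        params = fit_gaussian_bayes(X[train], y[train])
        pred, truth = predict(params, X[~train]), y[~train]
        # Per-class correct rate = diagonal of the row-normalised matrix.
        diagonal = [np.mean(pred[truth == c] == c) for c in classes]
        scores.append(np.mean(diagonal))
    return float(np.mean(scores))


# Synthetic stand-in data: three well-separated classes, two features.
rng = np.random.default_rng(1)
centers = np.array([[0.0, 0.0], [6.0, 0.0], [0.0, 6.0]])
X = np.vstack([rng.normal(c, 1.0, size=(60, 2)) for c in centers])
y = np.repeat(np.arange(3), 60)
accuracy = mean_diagonal_accuracy(X, y, train_fraction=0.30, n_partitions=20)

# Arithmetic behind the reported minimum training set size.
images_per_species = int(round(0.30 * 320))
print(accuracy, images_per_species, images_per_species * 7)
```

Calling `mean_diagonal_accuracy` once per training fraction would give
the kind of curve shown in Fig. 4.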
5.3. Analysis of species differentiation
Species differentiation experiments were performed with a
data set of 3891 oocyst images, comprising multiple strains of
the different Eimeria species that infect the domestic fowl. The
complete list of the strains and species utilized in this work
is presented in Table 1. From the overall data set, we used
30% of the images for the training set, and 70% for the test
set. A total of 100 paired sets were randomly generated and
each one was used as an input for the classification process
(see Algorithm 1), which in turn generated a confusion matrix
as a result of species discrimination. Therefore, at the end of
this iterative process, we had 100 confusion matrices, which were
used to compute the average confusion matrix. This latter matrix
contained the mean of correct classification for all tested species.
Finally, by computing the diagonal average, we obtained the overall
percentage of correct classification.
Table 3
Feature selection using the SFS test
# of features   Curvature, Geometry, Texture (features 1–13)   Rate (%)
 2              ××                                              77.25
 3              ×××                                             79.90
 4              ××× ×                                           81.02
 5              × ××× ×                                         82.45
 6              × ××× × ×                                       83.89
 7              × ××× × × ×                                     85.04
 8              × × ××× × × ×                                   85.64
 9              × ×× ××× × × ×                                  85.63
10              × ×× ×××××××                                    85.75
11              × × ×× ×××××××                                  85.73
12              × × ××××××××××                                  85.90
13              ××× ××××××××××                                  85.75
The best combination of features and the resulting correct species classification values are presented.
Fig. 4. Effect of the size of the training set on the classification accuracy. A
total of 2240 images were used for the evaluation. The size of the training
set is represented as percentages relative to the whole data set. The absolute
number of images is also presented (in parentheses).
The overall percentage of correct species assignment observed was
85.75%. Table 4 presents the final confusion matrix, where we can
clearly see that the best classification was obtained for E. maxima
(99.21%). Conversely, E. praecox and E. necatrix presented the worst
results, with correct discrimination rates of 74.23% and 74.90%,
respectively. These results were due to cross-classification with
other Eimeria species. Thus, E. necatrix was incorrectly classified
as E. acervulina (6.10%) and E. tenella (9.94%). Similarly, some
other Eimeria species were also incorrectly classified as E. necatrix
(E. acervulina in 12.53%, E. praecox in 10.94% and E. tenella in
12.22% of the cases). These results show that E. necatrix and
E. praecox are the most difficult species to differentiate, owing to
their morphological similarity to each other and to other species.
This is in agreement with what is classically reported by personnel
involved with visual inspection and classification of Eimeria
field samples.
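The overall rate can be checked directly from the values of Table 4.
The short sketch below, with variable names of our choosing, averages
the diagonal of the confusion matrix and locates the largest
off-diagonal entry.

```python
import numpy as np

species = ["E. acervulina", "E. maxima", "E. brunetti", "E. mitis",
           "E. praecox", "E. tenella", "E. necatrix"]

# Rows: true species; columns: ascribed species (values in %, Table 4).
confusion = np.array([
    [83.83, 0.01, 0.00, 1.26, 0.29, 2.07, 12.53],
    [0.00, 99.21, 0.79, 0.00, 0.00, 0.00, 0.00],
    [0.00, 0.31, 95.04, 0.00, 0.91, 3.19, 0.56],
    [0.99, 0.00, 0.00, 92.51, 2.52, 0.24, 3.75],
    [0.19, 0.00, 2.97, 6.08, 74.23, 5.59, 10.94],
    [0.65, 0.00, 1.98, 0.41, 4.24, 80.51, 12.22],
    [6.10, 0.00, 0.53, 3.98, 4.55, 9.94, 74.90],
])

# Overall correct classification = mean of the diagonal.
overall = float(confusion.diagonal().mean())

# Largest single confusion = largest off-diagonal entry.
off_diagonal = confusion.copy()
np.fill_diagonal(off_diagonal, 0.0)
true_idx, ascribed_idx = np.unravel_index(off_diagonal.argmax(),
                                          off_diagonal.shape)

print(f"overall: {overall:.2f}%")
print(f"largest confusion: {species[true_idx]} -> {species[ascribed_idx]}")
```

The diagonal mean reproduces the 85.75% reported above, and the
largest confusion is E. acervulina ascribed to E. necatrix (12.53%).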
5.4. A real-time diagnosis system
As a proof-of-principle that our approach could be applied for the
automatic morphological discrimination of Eimeria species, we
developed COCCIMORPH, a real-time system accessible through a web
interface. COCCIMORPH allows the user to upload an image, detect the
contour interactively and obtain a real-time classification. Fig. 5
shows the framework of this system, which is divided into the
following three levels:
Database: This level stores the feature vectors that compose
the data set. Micrographs and isolated images are also stored
and can be visualized through a web interface.
Application: This is the developmental level of the system,
which is divided into three modules: import subsystem, analysis
subsystem, and application and web server.
Client: This level is oriented to interact with the end-user,
allowing for the visualization and uploading of images for
diagnostic purposes.
The analysis subsystem represents the kernel of the system and
is responsible for the image pre-processing, feature extraction
and pattern classification. This module was entirely developed
in C++, resulting in a rapid response of the system during the
image processing step, thus permitting a real-time processing
through the web.
Table 4
Confusion matrix of species differentiation of Eimeria spp. of domestic fowl
Species         Oocyst   Ascribed species (%)
                number   E. ace   E. max   E. bru   E. mit   E. pra   E. ten   E. nec
E. acervulina   636      83.83    0.01     0.00     1.26     0.29     2.07     12.53
E. maxima       321      0.00     99.21    0.79     0.00     0.00     0.00     0.00
E. brunetti     418      0.00     0.31     95.04    0.00     0.91     3.19     0.56
E. mitis        757      0.99     0.00     0.00     92.51    2.52     0.24     3.75
E. praecox      747      0.19     0.00     2.97     6.08     74.23    5.59     10.94
E. tenella      608      0.65     0.00     1.98     0.41     4.24     80.51    12.22
E. necatrix     404      6.10     0.00     0.53     3.98     4.55     9.94     74.90
Fig. 5. Framework of the real-time system for automatic diagnosis of Eimeria species.
Considering that different users have distinct setups of microscopes
and digital cameras, the magnification and resolution of the captured
images can vary significantly from those used in this work. In order
to normalize the image scale, the user must first determine the
number of pixels/μm of the captured image. This can be simply done
using a calibrated microscope scale, such as those imprinted on
specialized measuring slides. Alternatively, hemocytometer counting
chambers, commonly used in many laboratories, can also be employed.
Once a picture of the scale is obtained, the custom spatial
resolution, expressed as the number of pixels/μm, can be easily
determined using any image processing program (e.g. Gimp, Adobe
Photoshop, etc.). Provided that the user obtains all other subsequent
images under the same conditions, this step needs to be performed
only once. COCCIMORPH’s interface presents a “pixel/micrometer”
fill-in-the-blank box where the user can enter custom values of
resolution. The system will then automatically normalize the
resolution relative to the images of the database.
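The calibration step amounts to two divisions, sketched below. The
function names and all numeric values (scale length, pixel counts and
the reference resolution of the database) are hypothetical and serve
only to illustrate the arithmetic; they are not the values used by
COCCIMORPH.

```python
def pixels_per_micrometer(scale_length_px: float,
                          scale_length_um: float) -> float:
    """Spatial resolution from a photographed calibration scale."""
    return scale_length_px / scale_length_um


def rescale_factor(user_px_per_um: float,
                   reference_px_per_um: float) -> float:
    """Factor by which a user image is resized to match the reference."""
    return reference_px_per_um / user_px_per_um


# Hypothetical example: a 100 um scale segment spans 550 pixels.
user_resolution = pixels_per_micrometer(550.0, 100.0)

# Hypothetical reference resolution of the database images.
reference_resolution = 11.0
factor = rescale_factor(user_resolution, reference_resolution)

# A structure measuring 120 pixels in the user image, in micrometers.
length_um = 120.0 / user_resolution

print(user_resolution, factor, round(length_um, 2))
```

With these illustrative numbers the user image would be enlarged
twofold before feature extraction, so that measurements expressed in
pixels become comparable with those of the database.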
5.5. The Eimeria image database
This work has benefited greatly from the ample availability of
biological samples. Thus, we constructed a comprehensive database of
oocyst micrographs, including parasite strains isolated from
different regions of the world. This repository was made publicly
available as the “Eimeria Image Database” through a link on the
COCCIMORPH site.
6. Discussion
In this paper we report the development of an effective pattern
recognition approach for shape characterization and automatic
discrimination of different species of the protozoan parasite
Eimeria spp. We propose the use of a set of features comprising three
categories: (a) curvature, (b) geometry, and (c) texture. These
features are extracted automatically and used to compose a 13D
feature vector. The system was developed and standardized using
microscopic images taken from pure samples of each one of the
parasite species. A large number of images, comprising in total 3891
oocyst micrographs, was used to reduce the effect of shape
heterogeneity. In addition, whenever available, we used several
samples of each species, collected from different geographic sources,
in order to dilute possible intra-specific variations and maximize
inter-specific discrimination. Other sources of data variability were
also assessed, including differences in microscope illumination and
contrast, as well as the volume of the parasite suspension between
the slide and coverslip. Finally, we used a relatively high number of
features, which were submitted to a feature selection process to
evaluate how many and which of them would compose the most
discriminative set.
The approach described here is simple and permits a reliable
identification of the parasite species. Features are not limited
to the simplest and most traditional geometric measures, as
we also computed curvature to represent the form, and texture
for internal structure characterization. Considering that this
diagnostic system is based on morphology, the correct species
assignment rate obtained (85.75%) can be considered a very
good result, especially if compared to a subjective human
diagnosis. Furthermore, despite the complexity of the algorithms
for feature extraction, the current implementation is
computationally efficient, permitting a rapid and real-time
interaction of the end-user through a web interface.
Finally, because the system uses generic algorithms, it can be
easily extended to discriminate other organisms. For this task,
the user just needs to provide a new image data set and use it to
train the system to discriminate the different classes. In fact, a
preliminary study, including 11 Eimeria species that infect the
domestic rabbit, showed a similar discriminative performance
(data not shown).
Previous studies using digital image processing applied to
Eimeria [7–9] have been reported in the literature. These systems,
however, were restricted to a semiautomatic oocyst diameter
measurement and still required substantial human interaction
during processing. In addition, most studies employed a small
number of morphological characters. Thus, some works used
as features the oocyst diameters [7,9], whereas others used the
Fourier transform of the contour [15] or computed statistics
from it [17].
Another general limitation was related to the classification
method, where multidimensional data distribution has not been
considered. Sommer [15] used Euclidean distance as a metric
for clustering. This metric assumes that the data are
homogeneously distributed, which is not necessarily the case,
especially when multidimensional data are used. Yang et al. [17],
working with human helminth eggs, used four morphometric
features and two stages of ANNs. These ANNs were used for
the discrimination of eggs from artifacts, and for species
discrimination, respectively. However, the estimation of the
average correct classification ratio was based on a very small image
data set, and the possible influence of intra-specific variability
was not assessed by the authors.
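The limitation of a plain Euclidean metric can be illustrated with a
small synthetic example. The two classes, their covariances and the
query point below are invented for illustration and do not come from
any of the cited studies; the Mahalanobis distance is used here as the
covariance-aware counterpart that appears in the exponent of a
Gaussian class model.

```python
import numpy as np


def euclidean(x, mean):
    """Distance that ignores how the class is spread around its mean."""
    return float(np.linalg.norm(x - mean))


def mahalanobis(x, mean, cov):
    """Distance scaled by the class covariance."""
    diff = x - mean
    return float(np.sqrt(diff @ np.linalg.inv(cov) @ diff))


# Class A is widely spread along the first axis; class B is compact.
mean_a, cov_a = np.array([0.0, 0.0]), np.array([[4.0, 0.0], [0.0, 0.04]])
mean_b, cov_b = np.array([3.0, 0.0]), np.array([[0.04, 0.0], [0.0, 0.04]])

x = np.array([2.0, 0.0])  # query point

print("Euclidean:  ", euclidean(x, mean_a), euclidean(x, mean_b))
print("Mahalanobis:", mahalanobis(x, mean_a, cov_a),
      mahalanobis(x, mean_b, cov_b))
```

The Euclidean distance places the query point closer to class B,
whereas the covariance-aware distance places it within one standard
deviation of class A and five standard deviations from class B.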
We also preliminarily considered alternative classification
methodologies, such as SVM [47,48]. More specifically, we
compared the performance of the Bayesian classifier and SVM in
situations involving seven Eimeria categories and 13
features. Because the obtained results did not indicate superior
performance of the SVM methodology (actually, slightly better
results were achieved with the Bayesian classifier), we decided
to adopt the Bayesian methodology. An additional reason
motivating this choice is the fact that the Bayesian classifier
is considerably simpler for on-line and interactive
implementations of the system.
Several possible applications of our system can be foreseen
in the near future. Initially, the large image data set of Eimeria
oocysts was made publicly available as the Eimeria Image
Database. The database now also includes circa 2500 images
of 11 Eimeria species that infect the domestic rabbit. Since this
database can be extended with new parasite images in the future,
it may represent an invaluable resource for classical
parasitologists and also for teaching purposes. From the
computational standpoint, it represents a novel repository of
parasite image data, useful for experimental protocols involving
pattern recognition methods. As such, new algorithms could be tested
using this data set as a gold standard of validated
biological samples.
In addition to the image database, the precise morphometric
data of the different Eimeria species provide a unique
opportunity to revisit the classic size estimations [10]. As such,
we intend to provide new parasite identification charts where
morphometric data will be presented in the light of current
microscope optics and digital image technology. This
kind of data will certainly be of high value to the Eimeria
scientific community, as well as to researchers in pattern
recognition, who may use this repository to test new measurement
and classification methodologies.
An envisaged application of the shape characterization
methodology described here is the implementation of a real-time
diagnostic tool through a web interface. In this direction,
we have created an experimental front-end for public access.
Since diagnosis is performed in real time, there is almost no
delay between the sample querying and the final diagnostic
result. We foresee that such a system would allow for a reliable
diagnosis with no need of biological sample transportation
between the farms and the reference laboratory. This represents
a particularly important achievement, since live sample traffic
may represent a sanitary risk due to the potential for disease
dissemination. Also, compared to other diagnostic approaches,
our system does not require personnel trained in parasite
identification or molecular biology techniques. The incorporation
of other parasites into the system may further increase the scope
of applicability of this electronic diagnostic tool. Coccidian
protozoa and helminth eggs, by presenting a morphology
similar to Eimeria oocysts, are the obvious candidates to be
included in the near future. With the current decreasing prices of
high resolution (above four megapixels) digital cameras, our
system is relatively cheap. In fact, any reasonable microscope
with a digital photo documentation system (a camera and an
adapter tube) would represent the minimum apparatus required.
Another aspect where shape characterization may have an
interesting impact is phylogenetic analysis. Classic
phylogenetics used to rely on morphometric data, but since DNA
sequencing became a mainstream and relatively cheap technique,
most current inferences are based on molecular data.
Because our morphological features have a quantitative
representation, they can be discretized and converted into data
matrices amenable to phylogenetic methods. Phylogenetic inference
of the genus Eimeria has been reported using the ribosomal
18S sequence [49]. Our group has recently characterized the
complete mitochondrial genome of the seven chicken Eimeria
species (Romano et al., manuscript in preparation) and used
this data set to reconstruct the phylogeny of this group.
Preliminary results show a good agreement between inferences
based on these molecular markers and the morphological
features described in this work. Thus, morphometric data applied
to phylogenetic inference may provide an interesting
counterpart to molecular-based phylogenies, with potentially exciting
evolutionary implications.
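One simple way to turn quantitative features into a discrete character
matrix is equal-width binning, sketched below. The binning scheme, the
function name and the numeric values are assumptions made for
illustration (the rows stand for four unnamed taxa and the values are
unitless placeholders, not measurements from this work); the
discretization actually used is not specified here.

```python
import numpy as np


def discretize_features(values: np.ndarray, n_states: int = 4) -> np.ndarray:
    """Equal-width binning of each feature column into integer states."""
    lo = values.min(axis=0)
    hi = values.max(axis=0)
    # Guard against constant columns, which would give a zero span.
    span = np.where(hi > lo, hi - lo, 1.0)
    scaled = (values - lo) / span  # each column mapped onto [0, 1]
    states = np.floor(scaled * n_states).astype(int)
    # The column maximum falls on the upper edge; keep it in the top state.
    return np.minimum(states, n_states - 1)


# Placeholder matrix: 4 taxa (rows) x 2 features (columns).
feature_means = np.array([
    [0.0, 5.0],
    [1.0, 5.0],
    [2.0, 7.0],
    [4.0, 9.0],
])

character_matrix = discretize_features(feature_means, n_states=4)
print(character_matrix)
```

Each row of the resulting matrix is a string of character states that
could be supplied, together with the taxon name, to software that
accepts discrete morphological characters.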
7. Conclusions
In this paper, an effective shape characterization approach for
automatic species differentiation in Eimeria spp. is proposed.
The extracted features identify different morphological
properties of the oocysts, related to the characterization of form,
geometry and internal structure. This shape representation was
applied for the differentiation of the seven Eimeria species of
domestic fowl, and the results revealed a good reliability of the
feature set. Finally, a real-time diagnosis system was
implemented and made available to the scientific community. We
believe that our system demonstrates the feasibility of using
computer-assisted systems to provide an interesting alternative
for the rapid diagnosis of parasites.
Acknowledgements
Luciano da F. Costa (308231/03-1) and Arthur Gruber
(306793/2004-0) are grateful to CNPq for financial support.
César A.B. Castañón received a fellowship from CAPES and
the work presented herein formed part of his Ph.D. Thesis.
Jane S. Fraga and Sandra Fernandez received fellowships from
CNPq and FAPESP, respectively.
References
[1] D. Comaniciu, P. Meer, D. Foran, Image-guided decision support system
for pathology, Mach. Vision Appl. 11 (4) (1999) 213–224.
[2] D. Sabino, L. Costa, E. Rizzatti, M. Zago, A texture approach to
leukocyte recognition, Real-Time Imaging 10 (4) (2004) 205–216.
[3] A. Jalba, M. Wilkinson, J. Roerdink, Shape representation and recognition
through morphological curvature scale spaces, IEEE Trans. Image
Process. 15 (2) (2006) 331–341.
[4] S. Trattner, H. Greenspan, G. Tepper, S. Abboud, Automatic identification
of bacterial types using statistical imaging methods, IEEE Trans. Med.
Imaging 23 (7) (2004) 807–820.
[5] X. Long, W. Cleveland, Y. Yao, Effective automatic recognition of
cultured cells in bright field images using Fisher’s linear discriminant
preprocessing, Image Vision Comput. 23 (13) (2005) 1203–1213.
[6] M. Sampat, A. Bovik, J. Aggarwal, K. Castleman, Supervised parametric
and non-parametric classification of chromosome images, Pattern
Recognition 38 (8) (2005) 1209–1223.
[7] J. Kucera, M. Reznicky, Differentiation of species of Eimeria from
the fowl using a computerized image-analysis system, Folia Parasitol.
(Praha) 2 (38) (1991) 107–113.
[8] A. Daugschies, S. Imarom, W. Bollwahn, Differentiation of porcine
Eimeria spp. by morphologic algorithms, Vet. Parasitol. 81 (3) (1999)
[9] A. Plitt, S. Imarom, A. Joachim, A. Daugschies, Interactive classification
of porcine Eimeria spp. by computer-assisted image analysis, Vet.
Parasitol. 86 (2) (1999) 105–112.
[10] P.L. Long, B.J. Millard, L.P. Joyner, C.C. Norton, A guide to laboratory
techniques used in the study and diagnosis of avian coccidiosis, Folia
Vet. Lat. 6 (3) (1976) 201–217.
[11] B.E. Schnitzler, P.L. Thebo, J.G. Mattsson, F.M. Tomley, M.W. Shirley,
Development of a diagnostic PCR assay for the detection and
discrimination of four pathogenic Eimeria species of the chicken, Avian
Pathol. 27 (5) (1998) 490–497.
[12] B.E. Schnitzler, P.L. Thebo, F.M. Tomley, A. Uggla, M.W. Shirley, PCR
identification of chicken Eimeria: a simplified read-out, Avian Pathol.
28 (1) (1999) 89–93.
[13] S. Fernandez, A.H. Pagotto, M.M. Furtado, A.M. Katsuyama, A.M.
Madeira, A. Gruber, A multiplex PCR assay for the simultaneous
detection and discrimination of the seven Eimeria species that infect
domestic fowl, Parasitology 127 (4) (2003) 317–325.
[14] A. Joachim, N. Dulmer, A. Daugschies, Differentiation of two
Oesophagostomum spp. from pigs, O. dentatum and O. quadrispinulatum,
by computer-assisted image analysis of fourth-stage larvae, Parasitol.
Int. 48 (1) (1999) 63–71.
[15] C. Sommer, Quantitative characterization, classification and recon-
struction of oocyst shapes of Eimeria species from cattle, Parasitology
116 (1) (1998) 21–28.
[16] C. Sommer, Quantitative characterization of texture used for
identification of eggs of bovine parasitic nematodes, J. Helminthol. 72
(2) (1998) 179–182.
[17] Y. Yang, D. Park, H. Kim, M. Choi, J. Chai, Automatic identification
of human helminth eggs on microscopic fecal specimens using digital
image processing and an artificial neural network, IEEE Trans. Biomed.
Eng. 48 (6) (2001) 718–730.
[18] K.W. Widmer, K.H. Oshima, S.D. Pillai, Identification of
Cryptosporidium parvum oocysts by an artificial neural network
approach, Appl. Environ. Microbiol. 68 (3) (2002) 1115–1121.
[19] O. Bruno, R. Cesar Jr., L. Consularo, L. Costa, Automatic feature
selection for biological shape classification in SYNERGOS, in:
Proceedings of the SIBGRAPI’98, International Symposium on
Computer Graphics, Image Processing, and Vision, 1998, pp. 363–370.
[20] R. Coelho, V.D. Gesù, G.L. Bosco, J. Tanaka, C. Valenti, Shape-based
features for cat ganglion retinal cells classification, Real-Time Imaging
8 (3) (2002) 213–226.
[21] L. Costa, S. dos Reis, R. Arantes, A. Alves, G. Mutinari, Biological
shape analysis by digital curvature, Pattern Recognition 37 (3) (2004)
[22] D. Regan, Human Perception of Objects, York University, New York,
[23] B. Olshausen, D. Field, Vision and the coding of natural images, Am.
Sci. 88 (3) (2000) 238–245.
[24] D. Zhang, G. Lu, Review of shape representation and description
techniques, Pattern Recognition 37 (1) (2004) 1–19.
[25] L. Costa, R. Cesar Jr., Shape Analysis and Classification: Theory and
Practice, CRC Press, Boca Raton, FL, 2000.
[26] R. Gonzales, R. Woods, Digital Image Processing, Addison-Wesley,
Reading, MA, 1993.
[27] F. Attneave, Some informational aspects of visual perception, Psychol.
Rev. 61 (3) (1954) 183–193.
[28] L. Riggs, Curvature as a feature of pattern vision, Science 181 (4104)
(1973) 1070–1072.
[29] F. Mokhtarian, A. Mackworth, A theory of multiscale, curvature-based
shape representation for planar curves, IEEE Trans. Pattern Anal. Mach.
Intell. 14 (8) (1992) 789–805.
[30] R. Cesar Jr., L. Costa, Towards effective planar shape representation
with multiscale digital curvature analysis based on signal processing
techniques, Pattern Recognition 29 (9) (1996) 1559–1569.
[31] H. Weyl, Symmetry, Princeton University Press, New Jersey, 1980.
[32] H. Zabrodsky, S. Peleg, D. Avnir, A measure of symmetry based on
shape similarity, in: Proceedings of the IEEE Conference on Computer
Vision and Pattern Recognition (CVPR92), 1992, pp. 703–706.
[33] M. Brady, H. Asada, Smoothed local symmetries and their
implementation, Technical Report, Cambridge, MA, USA, 1984.
[34] J. Sato, R. Cipolla, Affine integral invariants for extracting symmetry
axes, Image Vision Comput. 15 (8) (1997) 627–635.
[35] Y. Bonneh, D. Reisfeld, Y. Yeshurun, Quantification of local symmetry:
application to texture discrimination, Spat. Vision 8 (4) (1994) 515–530.
[36] B. Zavidovique, V.D. Gesù, Kernel based symmetry measure, in: ICIAP,
2005, pp. 261–268.
[37] I. Young, J. Walker, J. Bowie, An analysis technique for biological shape
I, Inform. Control 25 (4) (1974) 357–370.
[38] M. Tuceryan, A. Jain, Texture analysis, in: C.H. Chen, L.F. Pau, P.S.P.
Wang (Eds.), The Handbook of Pattern Recognition and Computer
Vision, second ed., World Scientific Publishing Co., Singapore, 1998,
pp. 207–247.
[39] R. Haralick, K. Shanmugam, I. Dinstein, Textural features for image
classification, IEEE Trans. Systems Man Cybern. SMC-3 (6) (1973)
[40] R. Jobanputra, D. Clausi, Preserving boundaries for image texture
segmentation using grey level co-occurring probabilities, Pattern
Recognition 39 (2) (2006) 234–245.
[41] R. Conners, Towards a set of statistical features which measure visually
perceivable qualities of texture, in: Proceedings of Pattern Recognition
Image Processing Conference, 1979, pp. 382–390.
[42] S. Theodoridis, K. Koutroumbas, Pattern Recognition, Academic Press,
San Diego, 1998.
[43] R. Duda, P. Hart, D. Stork, Pattern Classification, Wiley, New York,
[44] P.M. Narendra, K. Fukunaga, A branch and bound algorithm for feature
subset selection, IEEE Trans. Comput. 26 (9) (1977) 917–922.
[45] A. Jain, R. Duin, J. Mao, Statistical pattern recognition: a review, IEEE
Trans. Pattern Anal. Mach. Intell. 22 (1) (2000) 4–37.
[46] A. Jain, D. Zongker, Feature selection: evaluation, application, and small
sample performance, IEEE Trans. Pattern Anal. Mach. Intell. 19 (2)
(1997) 153–158.
[47] N. Cristianini, J. Shawe-Taylor, An Introduction to Support Vector
Machines and Other Kernel-based Learning Methods, Cambridge
University Press, Cambridge, 2000.
[48] K. Crammer, Y. Singer, On the algorithmic implementation of multiclass
kernel-based vector machines, J. Mach. Learn. Res. 2 (5) (2001)
[49] J.R. Barta, D.S. Martin, P.A. Liberator, M. Dashkevicz, J.W. Anderson,
S.D. Feighner, A. Elbrecht, A. Perkins-Barrow, M.C. Jenkins, H.D.
Danforth, M.D. Ruff, H. Profous-Juchelka, Phylogenetic relationships
among eight Eimeria species infecting domestic fowl inferred using
complete small subunit ribosomal DNA sequences, Parasitology 83 (2)
(1997) 262–271.
About the Author—CÉSAR ARMANDO BELTRÁN CASTAÑÓN holds a B.Sc. degree from Universidad Católica de Santa María, Peru, in Systems
Engineering, and a M.Sc. in Computer Science from the University of São Paulo (USP), Brazil. He is currently a Ph.D. student in Bioinformatics at the USP.
His research interests include 2D image shape analysis, pattern recognition, feature extraction and selection, content-based image retrieval, and computational
About the Author—JANE SILVEIRA FRAGA holds a Veterinary Medicine degree from the University of São Paulo (USP). She is currently finishing her
Ph.D. Thesis on the characterization of dsRNA viruses infecting Eimeria spp. of domestic fowl.
About the Author—SANDRA FERNANDEZ holds a Veterinary Medicine degree and a Ph.D. of Parasitology (USP). She is currently heading a research and
development group at Laboratório Biovet S/A, a private company that is a major vaccine producer in Brazil.
About the Author—ARTHUR GRUBER holds a Veterinary Medicine degree, and a Ph.D. in Biochemistry (USP). He is currently an Associate Professor at
the Department of Parasitology of the USP. His main research interests include molecular biology and genomics of coccidian parasites, and the development
of bioinformatics applications for sequence analysis.
About the Author—LUCIANO DA FONTOURA COSTA holds a B.Sc. in Electronic Engineering and Computer Science, a M.Sc. in Applied Physics, and a
Ph.D. in Electronic Engineering (King’s College, University of London). He is currently a Full Professor at the USP. His main interests include natural and
artificial vision, shape analysis, pattern recognition, computational neuroscience, computational biology, and bioinformatics.
... Intestinal parasites are considered of great interest in the implementation of algorithms for automated identification based on diagnostic imaging, because these organisms have stages of development with well-defined and reasonably homogeneous morphology (44). ...
... These two protozoans were detected using Artificial Neural Networks (ANN), which correctly identified 91.8 of the images of the Cryptosporidium oocyst and 99.6% the Giardia cyst, respectively, indicating that it can be extremely useful in automated diagnostics (66). The above-mentioned techniques have several limitations, such as the difficulty of quantifying morphological characteristics, allied to the high complexity of the algorithms (44). ...
... The purpose of image analysis is to classify and recognize objects of interest in digital images. This can be done in several ways, e.g., by identifying the colors, textures, shapes, movements and position of the objects in the images (44). Thus, different species of Eimeria found in farmyard chickens were included in the study of automated identification. ...
Full-text available
The increasingly close proximity between people and animals is of great concern for public health, given the risk of exposure to infectious diseases transmitted through animals, which are carriers of more than 60 zoonotic agents. These diseases, which are included in the list of Neglected Tropical Diseases, cause losses in countries with tropical and subtropical climates, and in regions with temperate climates. Indeed, they affect more than a billion people around the world, a large proportion of which are infected by one or more parasitic helminths, causing annual losses of billions of dollars. Several studies are being conducted in search for differentiated, more sensitive diagnostics with fewer errors. These studies, which involve the automated examination of intestinal parasites, still face challenges that must be overcome in order to ensure the proper identification of parasites. This includes a protocol that allows for elimination of most of the debris in samples, satisfactory staining of parasite structures, and a robust image database. Our objective here is therefore to offer a critical description of the techniques currently in use for the automated diagnosis of intestinal parasites in fecal samples, as well as advances in these techniques.
... The main objective of these works is to reduce human error during the diagnosis of fecal parasites, and to generate faster, more efficient, and accurate results. Due to the well-characterized and reasonably homogeneous morphology of intestinal parasites, these organisms are considered to be of great interest in the implementation of algorithms for automated recognition based on diagnostic image processing [10]. With regard to efficiency of sample processing, automated diagnostics offers high precision in the recognition of host positivity and parasite load [11]. ...
Full-text available
IPIs caused by protozoan and helminth parasites are among the most common infections in humans in LMICs. They are regarded as a severe public health concern, as they cause a wide array of potentially detrimental health conditions. Researchers have been developing pattern recognition techniques for the automatic identification of parasite eggs in microscopic images. Existing solutions still need improvements to reduce diagnostic errors and generate fast, efficient, and accurate results. Our paper addresses this and proposes a multi-modal learning detector to localize parasitic eggs and categorize them into 11 categories. The experiments were conducted on the novel Chula-ParasiteEgg-11 dataset that was used to train both EfficientDet model with EfficientNet-v2 backbone and EfficientNet-B7+SVM. The dataset has 11,000 microscopic training images from 11 categories. Our results show robust performance with an accuracy of 92%, and an F1 score of 93%. Additionally, the IOU distribution illustrates the high localization capability of the detector.
... Recently developed automated counting methods, such as the Kubic FLOTAC [69], Parasight [15,38,70] or Telenostic for cattle [71], as well as smartphone-based solutions are currently being investigated and developed [15,72]. The advantages of automated counting processes are shorter hands-on times, reduced requirements of personnel skills and training, lower personnel costs, reduced material costs and a reduced need for spatial capacities [23,[73][74][75][76][77][78][79]. More importantly, however, subjective fluctuations by examiners are eliminated, making the results more comparable [23,28,73,74,79]. ...
Full-text available
Background Due to high prevalence of anthelmintic resistance in equine helminths, selective treatment is increasingly promoted and in some countries a positive infection diagnosis is mandatory before treatment. Selective treatment is typically recommended when the number of worm eggs per gram faeces (epg) exceeds a particular threshold. In the present study we compared the semi-quantitative sedimentation/flotation method with the quantitative methods Mini-FLOTAC and FECPAK G2 in terms of precision, sensitivity, inter-rater reliability and correlation of worm egg counts to improve the choice of optimal diagnostic tools. Methods Using sedimentation/flotation (counting raw egg numbers up to 200), we investigated 1067 horse faecal samples using a modified Mini-FLOTAC approach (multiplication factor of 5 to calculate epgs from raw egg counts) and FECPAK G2 (multiplication factor of 45). Results Five independent analyses of the same faecal sample with all three methods revealed that variance was highest for the sedimentation/flotation method while there were no significant differences between methods regarding the coefficient of variance. Sedimentation/flotation detected the highest number of samples positive for strongyle and Parascaris spp. eggs, followed by Mini-FLOTAC and FECPAK G2 . Regarding Anoplocephalidae, no significant difference in frequency of positive samples was observed between Mini-FLOTAC and sedimentation/flotation. Cohen’s κ values comparing individual methods with the combined result of all three methods revealed almost perfect agreement (κ ≥ 0.94) for sedimentation/flotation and strong agreement for Mini-FLOTAC (κ ≥ 0.83) for strongyles and Parascaris spp. For FECPAK G2 , moderate and weak agreements were found for the detection of strongyle (κ = 0.62) and Parascaris (κ = 0.51) eggs, respectively. 
Despite higher sensitivity, the Mini-FLOTAC mean epg was significantly lower than that with FECPAK G2 due to samples with > 200 raw egg counts by sedimentation/flotation, while in samples with lower egg shedding epgs were higher with Mini-FLOTAC than with FECPAK G2. Conclusions For the simple detection of parasite eggs, for example, to treat foals infected with Parascaris spp., sedimentation/flotation is sufficient and more sensitive than the other two quantitative methods investigated in this study. Mini-FLOTAC is predicted to deliver more precise results in faecal egg count reduction tests due to higher raw egg counts. Finally, to identify animals with a strongyle epg above a certain threshold for treatment, FECPAK G2 delivered results comparable to Mini-FLOTAC.
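The relation between raw egg counts and epg described in this abstract can be illustrated with a short Python sketch. Only the two multiplication factors (5 for the modified Mini-FLOTAC, 45 for FECPAK G2) come from the abstract; the function names and the 200 epg treatment threshold used in the usage note are hypothetical and chosen purely for illustration.

```python
# Multiplication factors as stated in the abstract above.
MULTIPLICATION_FACTORS = {"Mini-FLOTAC": 5, "FECPAK G2": 45}


def eggs_per_gram(raw_count: int, method: str) -> int:
    """Convert a raw egg count into eggs per gram faeces (epg)."""
    if raw_count < 0:
        raise ValueError("raw_count must be non-negative")
    return raw_count * MULTIPLICATION_FACTORS[method]


def raw_count_needed(threshold_epg: int, method: str) -> int:
    """Smallest raw count whose epg reaches the threshold (hypothetical helper)."""
    factor = MULTIPLICATION_FACTORS[method]
    return -(-threshold_epg // factor)  # ceiling division on integers


if __name__ == "__main__":
    for method in MULTIPLICATION_FACTORS:
        print(method, eggs_per_gram(4, method), raw_count_needed(200, method))
```

With an assumed threshold of 200 epg, the decision would rest on 40 counted eggs for Mini-FLOTAC but only 5 for FECPAK G2. This is one way to read the statement that higher raw egg counts are expected to give more precise results.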
... Peculiarity [17], Early study on Eimeria taxonomy and their relations to Small businesses was entirely focused on morphological and biological characteristics [18]. The evolutionary relationships of Eimeria species infecting the bird have been determined according to Barta and colleagues' analysis of specific subunit (18S) rDNA gene sequences [19]. ...
Dep. Pathology and Poultry Diseases /coll. Vet.Med./ AL-Qadissiyah University Summary Coccidiosis is considered a severe disorder and an important one from a financial point of view in the poultry industry. It has been controlled effectively for decades, mostly with anticoccidial products, so a better understanding of the disorder may also help to control it. The economic impact of coccidiosis is probably underestimated, and increasing anticoccidial treatments might be useful to the broiler sector. Furthermore, a connection between subclinical coccidiosis and bacterial enteritis renders selecting the perfect tools and technique for poultry producers very difficult. Implementing sound shuttle and rotation programs is now a part of the solution for not just controlling clinical, but also subclinical coccidiosis.
... Then the SOM clustering algorithm was used to identify five different dinocyst clusters, namely Proximate dinocysts, Freshwater algae, Proximochorate dinocysts, Chorate dinocysts and Proximate dinocysts with long horns. In the next year, Castañón et al. [92] presented an image recognition system for Eimeria species. The dataset included 3891 micrographs of oocysts of seven Eimeria species. ...
Microorganisms or microbes comprise the majority of the diversity on earth and are extremely important to human life. They are also integral to processes in the ecosystem. Recognizing them is highly tedious, but essential in microbiology for carrying out different experiments. To overcome certain challenges, machine learning techniques assist microbiologists in automating the entire process. This paper presents a systematic review of research done using machine learning (ML) and deep learning techniques in image recognition of different microorganisms. This review investigates certain research questions to analyze the studies concerning image pre-processing, feature extraction, classification techniques, evaluation measures, methodological limitations and technical development over a period of time. In addition, this paper addresses certain challenges faced by researchers in this field. A total of 100 research publications, in the chronological order of their appearance, have been considered for the time period 1995–2021. This review will be extremely beneficial to researchers due to its detailed analysis of different methodologies and comprehensive overview of the effectiveness of different ML techniques applied in the field of microorganism image recognition.
Coccidiosis is the main parasitic disease of poultry, caused by intracellular protozoa that target different parts of the intestinal tract and damage them. For this reason, coccidiosis induces an enormous economic loss in the poultry industry. The Eimeria life cycle is complicated and comprises exogenous and endogenous stages, inducing an inflammatory response that results in enteric damage associated with hemorrhagic diarrhea, impaired feed digestion and nutrient absorption, dehydration, blood loss and mortality. Hence, it is very important to understand Eimeria parasites for elimination and treatment. This disease has been controlled by various anticoccidial drugs and vaccines as the most common management practices. However, the occurrence of drug resistance to anticoccidial drugs and the lack of a guarantee of safety with vaccine use have led to the development of alternative strategies to control coccidiosis. For these reasons, phytogenic compounds are emerging as an alternative to previous methods for the control and prevention of poultry coccidiosis. The main aim of this review is to provide an overview of coccidiosis including etiology, morphology, life cycle, pathogenicity, clinical signs, diagnosis, control and prevention.
Interpretation errors may still represent a limiting factor for diagnosing Cryptosporidium spp. oocysts with the conventional staining techniques. Humans and machines can interact to solve this problem. We developed a new temporary staining protocol associated with a computer program for the diagnosis of Cryptosporidium spp. oocysts in fecal samples. We established 62 different temporary staining conditions by studying 20 experimental protocols. Cryptosporidium spp. oocysts were concentrated using the Three Fecal Test (TF-Test®) technique and confirmed by the Kinyoun method. Next, we built a bank with 299 images containing oocysts. We used segmentation in superpixels to cluster the patches in the images; then, we filtered the objects based on their typical size. Finally, we applied a convolutional neural network as a classifier. The trichrome modified by Melvin and Brooke, at a working concentration of 25%, was the most efficient dye for use in the computerized diagnosis. The algorithms of this new program showed a positive predictive value of 81.3% and a sensitivity of 94.1% for the detection of Cryptosporidium spp. oocysts. With the combination of the chosen staining protocol and the precision of the computational algorithm, we improved the Ova and Parasite exam (O&P), contributing an advance toward automated diagnosis.
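The two performance figures quoted in this abstract, positive predictive value and sensitivity, follow from detection counts. The abstract does not report these counts, so the short Python sketch below uses hypothetical numbers only to show the definitions.

```python
def positive_predictive_value(true_pos: int, false_pos: int) -> float:
    """Fraction of reported detections that are real oocysts."""
    if true_pos + false_pos == 0:
        raise ValueError("no positive detections")
    return true_pos / (true_pos + false_pos)


def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of real oocysts that were detected."""
    if true_pos + false_neg == 0:
        raise ValueError("no real positives")
    return true_pos / (true_pos + false_neg)


if __name__ == "__main__":
    # Hypothetical counts, not taken from the study.
    tp, fp, fn = 8, 2, 2
    print(positive_predictive_value(tp, fp), sensitivity(tp, fn))
```

A detector with high sensitivity but lower positive predictive value, as reported here, misses few oocysts but flags some objects that are not oocysts.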
Avian coccidiosis continues to be one of the costliest diseases of commercial poultry. Understanding the epidemiology of Eimeria species in poultry flocks and the resistance profile to common anticoccidials is important to design effective disease prevention and control strategies. This study examined litter samples to estimate the prevalence and distribution of Eimeria species among broiler farms in four geographic regions of Colombia. A total of 245 litter samples were collected from 194 broiler farms across representative regions of poultry production between March and August 2019. The litter samples were processed for oocyst enumeration and speciation after sporulation. End-point PCR analysis was conducted to confirm the presence of Eimeria species. Anticoccidial sensitivity was determined with 160 Ross AP males in 5 treatment groups: non-infected, non-medicated control (NNC); infected, non-medicated control (INC); infected, salinomycin treated (SAL, dose: 66 ppm); infected, diclazuril treated (DIC, dose: 1 ppm); and infected, methylbenzoquate-clopidol treated (MET.CLO, dose: 100 ppm). All birds were orally inoculated with 1 × 10⁶ sporulated oocysts using a 1 mL syringe, except for the NNC group, which received 1 mL of water. Eimeria spp. were found in 236 (96.3%) out of 245 individual houses, representing 180 (92.8%) out of 194 farms. E. acervulina was the most prevalent species (35.0%) followed by E. tenella (30.9%), E. maxima (20.4%), and other Eimeria spp. (13.6%). However, mixed species infections were common, with the most prevalent combination being mixtures of E. acervulina, E. maxima, E. tenella, and other species in 31.4% of the Eimeria-positive samples. PCR analysis identified E. acervulina, E. maxima, E. tenella, E. necatrix, E. mitis, and E. praecox with variable prevalence across farms and regions. 
In anticoccidial sensitivity testing of strains of Eimeria isolated from one region, no treatment difference (P > 0.05) was observed in final weight (BW), weight gain (BWG) or feed conversion (FCR). The global resistance index (GI) classified SAL and MET.CLO as having good efficacy (85.79 and 85.49, respectively) and DIC as having limited efficacy (74.52%). These results demonstrate the ubiquitous nature of Eimeria spp. and identify the current state of sensitivity to commonly used anticoccidials in a region of poultry importance for Colombia.
The authors view symmetry as a continuous feature, and define a continuous symmetry measure (CSM) of shapes. The general definition of symmetry measure allows a comparison of the amount of symmetry of different shapes and the amount of different symmetries of a single shape. Furthermore, the CSM is associated with the symmetric shape that is closest to the given one, enabling visual evaluation of the CSM.
This chapter reviews and discusses various aspects of texture analysis. The concentration is on the various methods of extracting textural features from images. The geometric, random field, fractal, and signal processing models of texture are presented. The major classes of texture processing problems such as segmentation, classification, and shape from texture are discussed. The possible application areas of texture such as automated inspection, document processing, and remote sensing are summarized. A bibliography is provided at the end for further reading.
We describe a polymerase chain reaction (PCR)-based assay for the detection, identification and differentiation of pathogenic species of Eimeria in poultry. The internal transcribed spacer 1 (ITS1) regions of ribosomal DNA (rDNA) from Eimeria acervulina, E. brunetti, E. necatrix and E. tenella were sequenced and regions of unique sequences identified. Four pairs of oligonucleotide primers, each designed to amplify the ITS1 region of a single Eimeria species, were synthesised for use in the PCR assay. In tests on purified genomic DNA from all seven species of Eimeria that infect the chicken, each of the four primer pairs amplified the ITS1 region from only their respective target species. The robustness of the approach was further demonstrated by the amplification of specific DNA fragments from tissues of experimentally infected animals and from oocysts recovered from field samples. We conclude that the ITS1 regions of Eimeria species contain sufficient inter-specific sequence variation to enable the selection of primers that can be applied in PCR analyses to detect and differentiate between species. In future work they may provide excellent markers for epidemiological studies.
The human brain may hold the secrets to the best image-compression algorithms.
Beginning studies are described aimed at defining a set of features to be used with the Spatial Gray Level Dependence Method (SGLDM) which will measure image characteristics believed to be used by humans in the discrimination of textures. In particular a procedure is described which will allow the SGLDM to be used to measure the unit cell size of textures and the coarseness of textures.
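The idea of using spatial gray-level dependence to estimate the unit cell size of a texture can be sketched in a few lines of Python. This is a minimal illustration, not the procedure of the cited work: the co-occurrence matrix is the usual SGLDM construction, but the contrast feature and the rule "look for the displacement where contrast drops to a minimum" are assumptions made here, and the function names are hypothetical.

```python
def cooccurrence_matrix(image, dx, dy, levels):
    """Count gray-level pairs (g1, g2) separated by the displacement (dx, dy)."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
    return counts


def contrast(counts):
    """Mean squared gray-level difference over all counted pixel pairs."""
    total = sum(sum(row) for row in counts)
    if total == 0:
        return 0.0
    weighted = sum((i - j) ** 2 * n
                   for i, row in enumerate(counts)
                   for j, n in enumerate(row))
    return weighted / total


if __name__ == "__main__":
    # Synthetic two-level stripe texture with a horizontal period of 4 pixels.
    stripes = [[0, 0, 1, 1] * 3 for _ in range(4)]
    for dx in range(1, 6):
        print(dx, contrast(cooccurrence_matrix(stripes, dx, 0, levels=2)))
```

For this synthetic texture the contrast falls to zero at a displacement of 4 pixels, which matches the period of the stripes. A coarse texture would show a contrast that rises slowly with displacement, while a fine texture would show a contrast that rises quickly.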
Features: Serves as both an introduction to and a reference for computer-based analysis and recognition of shapes. Includes a comprehensive review of the basic mathematical concepts involved. Examines various techniques for shape characterization and analysis, including shape contour analysis and extraction of different shape measures for statistical classification. Explains several multiscale techniques, such as wavelets and multiscale skeletonization. Focuses on two-dimensional shapes but includes concepts and techniques that can be generalized for 3-D shapes. Identifies future trends and developments. Includes numerous illustrations and real-world examples.
Summary: Advances in shape analysis impact a wide range of disciplines, from mathematics and engineering to medicine, archeology, and art. Anyone just entering the field, however, may find the few existing books on shape analysis too specific or advanced, and for students interested in the specific problem of shape recognition and characterization, traditional books on computer vision are too general. Shape Analysis and Classification: Theory and Practice offers an integrated and conceptual introduction to this dynamic field and its myriad applications. Beginning with the basic mathematical concepts, it deals with shape analysis, from image capture to pattern classification, and presents many of the most advanced and powerful techniques used in practice. The authors explore the relevant aspects of both shape characterization and recognition, and give special attention to practical issues, such as guidelines for implementation, validation, and assessment. Shape Analysis and Classification provides a rich resource for the computational characterization and classification of general shapes, from characters to biological entities. Both students and researchers can directly use its state-of-the-art concepts and techniques to solve their own problems involving the characterization and classification of visual shapes.