Retinal blood vessel segmentation in high resolution fundus
photographs using automated feature parameter estimation
José Ignacio Orlando (a,b,c), Marcos Fracchia (c), Valeria del Río (c) and Mariana del Fresno (b,c,d)
(a) Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET), Argentina;
(b) Pladema Institute, Gral. Pinto 399, 7000 Tandil, Argentina;
(c) Facultad de Ciencias Exactas, UNCPBA, Pinto 399, 7000 Tandil, Argentina;
(d) Comisión de Investigaciones Científicas de la Provincia de Buenos Aires (CIC-PBA), Buenos Aires, Argentina
ABSTRACT

Several ophthalmological and systemic diseases are manifested through pathological changes in the properties and the distribution of the retinal blood vessels. The characterization of such alterations requires the segmentation of the vasculature, which is a tedious and time-consuming task that is infeasible to perform manually. Numerous attempts have been made to propose automated methods for segmenting the retinal vasculature from fundus photographs, although their application in real clinical scenarios is usually limited by their ability to deal with images taken at different resolutions. This is likely due to the large number of parameters that have to be properly calibrated according to each image scale. In this paper we propose to apply a novel strategy for automated feature parameter estimation, combined with a vessel segmentation method based on fully connected conditional random fields. The estimation model is learned by linear regression from structural properties of the images and known optimal configurations that were previously obtained for low resolution data sets. Our experiments on high resolution images show that this approach is able to estimate appropriate configurations that are suitable for performing the segmentation task without requiring re-engineering of parameters. Furthermore, our combined approach reported state-of-the-art performance on the benchmark data set HRF, as measured in terms of the F1-score and the Matthews correlation coefficient.
Keywords: Retinal vessel segmentation, Fundus imaging, Parameter estimation
1. INTRODUCTION

Fundus photographs are a cost-effective, non-invasive medical imaging modality that is widely used by ophthalmologists for manually inspecting the retina.1 It is currently the most widely used imaging technique for screening several ophthalmic diseases such as diabetic retinopathy2 and glaucoma,3 which are among the leading causes of avoidable blindness in the world.4 Current systems for automated fundus image analysis usually require segmenting the vasculature first,5 as blood vessels aid in numerous applications. In particular, vessel segmentations are used for characterizing pathological changes associated with ophthalmic and systemic diseases,1 for localizing other anatomical parts of the retina,6 for detecting abnormalities such as red lesions,7,8 and as landmarks for multimodal image registration.9
Automated blood vessel segmentation is a challenging task that has been widely explored in the literature.1 In general, it is tackled by means of supervised or unsupervised methods. Unsupervised methods are based on pixel-level features such as Gabor filters,10 line detectors11 or morphological operations,12 among others.13 These features are designed to characterize vascular pixels, and they are afterwards thresholded to retrieve a binary representation of the vasculature.13 Supervised methods are built on top of these strategies: they first train a classifier from annotated data and then categorize image pixels using such a model.10,14,15 Although most of the current methods are able to achieve high performance on standard low resolution data sets such as DRIVE16 and STARE,17 they usually fail when images are taken at higher resolutions. This is likely
Further author information: (Send correspondence to J.I.O.)
J.I.O.: E-mail:, Telephone: +54 249 4439690
Figure 1: Retinal blood vessel segmentation in high resolution fundus images. (a) Fundus photograph. (b)
Segmentation obtained using our method with parameters adapted using a scaling factor. (c) Segmentation
obtained using our method with parameters estimated using linear regression.
because the existing features are highly parametrized and require an intensive calibration process to improve their original performance. However, it is extremely time-consuming to tune these parameters using standard search approaches such as grid search.18 Furthermore, this process has to be repeated for every new data set with a different resolution.
One alternative to overcome this issue is to downscale the images to approximately fit the same resolution as those used for tuning parameters.3 Nevertheless, this approach reduces the ability of the features to characterize thin structures, which is relevant in clinical applications.1 Other strategies are based on modeling resolution changes and adjusting the feature parameters accordingly. In our previous work,19 we proposed to apply a scaling factor, proportional to the change in the field of view (FOV) width, to automatically rescale those parameters. Although this approach is able to improve results on high resolution data sets, the resulting segmentations suffer from issues such as false negatives due to the central reflex of arteries (Figure 1), and are less accurate than those obtained on low resolution images. Vostatek et al.20 have recently proposed a different strategy for predicting parameters, based on linear regression. Such an approach correlates the optimal parameter values with the angular resolution of the images. A line is fitted to these points by minimizing the mean squared error, and its slope and intercept are subsequently used to automatically predict the parameters suitable for a new given resolution.
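This fitting step can be sketched with ordinary least squares. The resolution/parameter pairs below are illustrative placeholders, not the figures used in the paper:

```python
import numpy as np

# Hypothetical (angular resolution, optimal structuring-element length l)
# pairs, one per low resolution training set; in practice the l values
# come from the grid search described above.
resolutions = np.array([12.5, 17.1, 14.2, 27.7])
optimal_l = np.array([9.0, 12.0, 10.0, 19.0])

# Fit a line l = slope * resolution + intercept by least squares.
slope, intercept = np.polyfit(resolutions, optimal_l, deg=1)

def predict_parameter(resolution):
    """Predict the feature parameter for a new, unseen image resolution."""
    return slope * resolution + intercept

# Estimate l for a (hypothetical) higher-resolution image.
l_new = predict_parameter(55.0)
```

The same one-line model is fitted independently for each feature parameter.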
In this study we propose to take advantage of this recently published strategy by integrating it with our blood vessel segmentation method based on learning a fully connected conditional random field model.15,19 In particular, we improve the original estimation strategy proposed by Vostatek et al.20 by analyzing other structural parameters of the images that are also easily measurable. Moreover, we apply this estimator in combination with our supervised method, which incorporates shape priors within the learning process to better capture the interaction between vascular pixels.15,19 Our hypothesis is that integrating this parameter estimator with our segmentation approach will result in better performance than merely using simple pixel classifiers such as Gaussian Mixture Models (GMMs).10,20 We have evaluated this adaptive model on HRF,18 a benchmark data set of high resolution fundus images that is widely used in the literature for evaluating segmentation methods. Our results empirically show that this hybrid approach significantly improves the original performance of the method, outperforming other existing strategies evaluated following a similar protocol.
The remainder of this paper is organized as follows. Section 2 explains our method, including details about
the selected features, our segmentation approach and the parameter estimation strategy. Section 3 describes the
data sets used in our experiments and the quantitative metrics applied for evaluation, while Section 4 presents
the obtained results. Finally, Section 5 concludes the paper.
2. METHODS

A schematic representation of our method is depicted in Figure 2. Given different training sets of low resolution fundus images and their manual annotations, a grid search approach is followed to find the optimal configuration of feature parameters for each of their resolutions (Section 2.1.1). Subsequently, structural parameters such as
Figure 2: Schematic representation of our strategy for segmenting retinal vessels in high resolution images with
automated parameter estimation.
the approximate diameters of the optic disc and the FOV, the calibre of the largest vessel and the ratio between the FOV diameter and the angle of aperture are measured from a subset of image examples. These values and the optimal parameters are afterwards used to fit an estimation line using linear regression. To learn the vessel segmentation model from other, high resolution images, the structural parameters are taken from a subset of these images and used to automatically adjust the parameters of the selected features to the new resolution (Section 2.2). Finally, the features are extracted and used for training our supervised segmentation approach (Section 2.1.2), which is applied at test time to segment the vasculature on new images with unknown annotations.
2.1 Vessel segmentation approach
2.1.1 Feature extraction
As a proof of concept, we have selected a set of three different features that are widely applied in the literature for characterizing pixels belonging to blood vessels: the vessel enhancement technique based on mathematical morphology by reconstruction proposed by Zana and Klein,12 the 2D Gabor wavelet by Soares et al.,10 and the multiscale line detectors by Nguyen et al.11 Nevertheless, this approach can be extended to features other than those used in this paper. We briefly describe them below to analyze their relevant parameters; the interested reader can refer to the original references and/or to a recently published review13 for further information. Table 1 summarizes the parameters of these features.
The green color band is the one that exhibits the highest contrast between the retinal vasculature and the
remaining anatomical parts of the fundus.13, 21, 22 Hence, all the features are extracted from the inverted version
of this specific channel. Furthermore, a larger angle of aperture is simulated as proposed by Soares et al.10 to
reduce potential artifacts in the borders of the FOV.
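A minimal sketch of this preprocessing step is shown below (the aperture simulation is omitted); the toy `rgb` array stands in for a real fundus photograph:

```python
import numpy as np

# Toy stand-in for an RGB fundus image of shape (H, W, 3), dtype uint8.
rgb = np.zeros((4, 4, 3), dtype=np.uint8)
rgb[..., 1] = 200  # bright retinal background in the green band

# Extract the green channel and invert it, so that vessels (which are
# dark in the green band) become the brightest structures.
green = rgb[..., 1].astype(np.float64) / 255.0
inverted_green = 1.0 - green
```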
The feature based on morphological operations by reconstruction was originally introduced by Zana and
Klein12 for enhancing curvilinear structures on retinal angiographies and remote sensing imagery. It relies on a
series of morphological operations performed at different angles, using a linear structuring element of length l.
By means of the application of openings, top-hats, a Laplacian of Gaussian, an additional opening, a closing and
a last opening, linear connected elements whose curvature varies smoothly along a crest line are retrieved. We
have used our own implementation of this feature, which has been made publicly available.19 Its main parameter, l, was empirically observed to be extremely relevant for achieving a proper vessel enhancement. It was previously reported that the value of l is correlated with the calibre of the major vessel in the image.
The 2D Gabor wavelets are known for their intrinsic ability to capture oriented features, which are relevant for characterizing pixels belonging to the retinal vessels.10 We have used the public implementation provided by Soares et al.10 to extract this feature. The main parameter of the method is the set of scales a, which is associated with the multiple calibres of the vessels in the image.
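To illustrate the role of the scale parameter, the snippet below builds a real, zero-mean 2D Gabor kernel with numpy. This is a simplified stand-in, not the exact Soares et al. wavelet formulation; the carrier wavelength and envelope width chosen here are illustrative assumptions:

```python
import numpy as np

def gabor_kernel(scale, theta, size=None):
    """Real 2D Gabor kernel; `scale` plays the role of the wavelet
    dilation parameter a (simplified stand-in for the Soares feature)."""
    sigma = scale
    if size is None:
        size = int(6 * sigma) | 1  # odd support covering ~3 sigma
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    # Rotate coordinates by theta to orient the filter.
    xr = x * np.cos(theta) + y * np.sin(theta)
    yr = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(xr ** 2 + yr ** 2) / (2 * sigma ** 2))
    carrier = np.cos(2 * np.pi * xr / (4 * sigma))
    kernel = envelope * carrier
    return kernel - kernel.mean()  # zero mean, as wavelets require

k = gabor_kernel(scale=2.0, theta=0.0)
```

In the actual method, responses are taken over several orientations and the scales a are adapted to the vessel calibres of the data set.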
Finally, we also evaluated the application of line detectors, as proposed by Nguyen et al.11 This approach is based on analyzing the response of the image to a line of length l ∈ {1, ..., W} rotated at different angles. W corresponds to the largest length of the analyzed segments, and is a significant parameter of the method. As considering values of l too similar to each other in high resolution images would significantly increase the size of the feature vector with redundant information, we have restricted the number of potential scales to eight equidistant l values, spanning from 1 to W.
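The scale selection just described reduces to picking eight equidistant lengths between 1 and W; the value of W below is illustrative (in practice it is estimated from the image, as discussed in Section 2.2):

```python
import numpy as np

W = 41  # illustrative largest segment length

# Eight equidistant line lengths spanning 1..W, rounded to integers.
scales = np.unique(np.round(np.linspace(1, W, num=8)).astype(int))
```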
2.1.2 Fully connected CRF model for retinal vessel segmentation
The blood vessel segmentation task is tackled by means of a recently published method based on learning a Fully Connected Conditional Random Field (FC-CRF) model using a Structured Output Support Vector Machine (SOSVM). Such an approach has been demonstrated to be effective for extracting the retinal vasculature,15,19 and has also been applied in the context of other tasks such as automated glaucoma screening3 or red lesion detection.8 The interested reader can refer to the original reference for further details.
Formally, our purpose is to assign a labeling y = {y_i} to every pixel i in the image I, with y_i ∈ L = {−1, 1}, where −1 corresponds to a non-vascular pixel and 1 to a vessel pixel. An estimated segmentation ŷ can be obtained by solving:

ŷ = arg min_{y ∈ L} E(y|I)    (1)

where E(y|I) is a Gibbs energy defined over the cliques of G for a given labeling y for I. This energy is given by:

E(y|I) = Σ_i ψ_u(y_i, x_i) + Σ_{(i,j) ∈ C_G} ψ_p(y_i, y_j, f_i, f_j)    (2)

where x_i and f_i are the unary and pairwise features, respectively. Unary potentials ψ_u define a log-likelihood over y, and are obtained using a classifier.23 On the contrary, pairwise potentials ψ_p define a similar distribution but over the interactions between pairs of pixels, as given by C_G. This set is defined by the graph connectivity rule: in our fully connected definition, all the pixels interact with each other.
Unary potentials are obtained as follows:

ψ_u(y_i, x_i) = −⟨w_u^{y_i}, x_i⟩ − w_β^{y_i} β    (3)

where β is a bias constant, and w_u^{y_i} and w_β^{y_i} are the weight vector for the features and the weight for the bias term, respectively, associated with the label y_i. The unary vector x_i is given by an arbitrary combination of features extracted from the image (in this work, line detectors and 2D Gabor wavelets).
Pairwise potentials are restricted to be a linear combination of Gaussian kernels by the efficient inference approach of Krähenbühl and Koltun,23 which is applied to minimize E(y|I). The pairwise energy is given by:

ψ_p(y_i, y_j, f_i, f_j) = µ(y_i, y_j) Σ_m w_p^{(m)} k^{(m)}(f_i^{(m)}, f_j^{(m)})    (4)

where each k^{(m)} is a fixed function over an arbitrary feature f^{(m)} (in this work, the response to the vessel enhancement method by Zana and Klein), w_p^{(m)} is a linear combination weight, and µ(y_i, y_j) is a label compatibility function. The Gaussian kernels are used to quantify the similarity of f^{(m)} between neighboring pixels,
Table 1: Parameters to estimate.

Method                  | Reference                                  | Identified parameter
Mathematical morphology | Zana and Klein, 2001 [12]                  | l
2D Gabor wavelets       | Soares et al., 2006 [10]                   | a
Line detectors          | Nguyen et al., 2013 [11]                   | W
Fully connected CRF     | Orlando et al., 2014 [15] and 2017 [19]    | θ_p
while the compatibility function µ penalizes similar pixels assigned to different labels, and is given by the Potts model µ(y_i, y_j) = [y_i ≠ y_j].19 The pairwise kernels have the following form:

k^{(m)}(f_i^{(m)}, f_j^{(m)}) = exp( −|p_i − p_j|² / (2 θ_p²) − |f_i^{(m)} − f_j^{(m)}|² / (2 (θ^{(m)})²) )    (5)

with p_i and p_j being the coordinate vectors of pixels i and j. Including the positions in the pairwise term allows increasing the effect of close pixel interactions over distant ones. The parameters θ_p and θ^{(m)} are used to control the degree of relevance of each kernel in the expression. Hence, if θ_p increases, much longer interactions are taken into account; on the other hand, if θ_p decreases, only local neighborhoods affect the result. Likewise, increasing or decreasing θ^{(m)} will tolerate higher or lower differences in the pairwise feature. As in our previous work, θ^{(m)} is fixed automatically as the median of a random sample of pairwise distances.19 On the contrary, θ_p must be properly adjusted, as it strongly depends on the resolution of the images. We have evaluated whether this parameter can be automatically determined by our estimation strategy, as described in Section 2.2.
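The Gaussian pairwise kernel of Eq. (5) can be sketched directly in a few lines; the function below evaluates it for a single pair of pixels:

```python
import numpy as np

def pairwise_kernel(p_i, p_j, f_i, f_j, theta_p, theta_m):
    """Gaussian pairwise kernel k(m) between pixels i and j, as in Eq. (5):
    a position term controlled by theta_p and a feature term by theta_m."""
    pos = np.sum((np.asarray(p_i, dtype=float) - np.asarray(p_j, dtype=float)) ** 2) / (2 * theta_p ** 2)
    feat = np.sum((np.asarray(f_i, dtype=float) - np.asarray(f_j, dtype=float)) ** 2) / (2 * theta_m ** 2)
    return np.exp(-pos - feat)
```

Note how a larger theta_p makes the kernel decay more slowly with distance, so longer-range interactions contribute to the energy.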
The weights for both the unary and the pairwise potentials are learned using a SOSVM, as we have formerly
proposed.15 Further details about this supervised learning approach can be found in the reference.
2.2 Automated adjustment of feature parameters
Table 1 lists all the parameters to estimate: the length l of the structuring elements used by the Zana and Klein method, the scales a used to compute the 2D Gabor wavelets, the W_0, W and step values for the line detectors, and the amplitude θ_p of the fully connected CRF.
As previously mentioned, we used our own version of the parameter estimation strategy proposed by Vostatek et al.20 The original approach consists of first using low resolution images and their corresponding manual annotations to optimize each parameter by grid search, evaluating each configuration according to a performance measurement Q. Afterwards, a linear regression model is learned from pairs of data points consisting of the optimal parameter value and the corresponding angular resolution. Our version of this approach introduces several modifications to the original pipeline. Vostatek et al. proposed to use the area under the ROC curve20 as Q to guide the optimization process. However, it has been previously demonstrated that this metric is affected by the degree of imbalance in the data.24 Instead, we propose to use the area under the precision/recall curve to quantify the quality of the feature parameters, which is more appropriate for this type of problem.24,25 The nature of the θ_p parameter forces us to use a different optimization approach, as the FC-CRF has to be trained for a certain θ_p value and then evaluated in terms of a binary segmentation metric. We performed this task as follows: given a value of θ_p, we trained the FC-CRF on the training set and evaluated its contribution to improving the average F1-score (Section 3.2) on the validation set. This process was repeated for all the θ_p values, and the parameter that reported the highest F1-score was taken as optimal.
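The θ_p search loop just described can be sketched as follows; `train_crf` and `f1_on_validation` are hypothetical stand-ins for training the FC-CRF and scoring it on the validation set:

```python
def select_theta_p(theta_grid, train_crf, f1_on_validation):
    """Pick the theta_p value whose trained model maximizes validation F1."""
    best_theta, best_f1 = None, -1.0
    for theta_p in theta_grid:
        model = train_crf(theta_p)       # train the FC-CRF for this theta_p
        f1 = f1_on_validation(model)     # average F1-score on validation set
        if f1 > best_f1:
            best_theta, best_f1 = theta_p, f1
    return best_theta, best_f1

# Toy demonstration: "training" returns theta_p itself and the validation
# F1 peaks at theta_p = 7, so the search should recover that value.
theta, f1 = select_theta_p(
    [1, 3, 5, 7, 9],
    train_crf=lambda t: t,
    f1_on_validation=lambda m: 1.0 - abs(m - 7) / 10.0,
)
```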
Once the optimal parameters are found, different structural measurements are manually taken from 2 randomly sampled images of each subset in the set used for optimizing parameters. The average for each image pair is taken as a representative estimate for the other images in the subset. In particular, we have considered:

Largest vessel calibre (in pixels): measured as the average length of 3 profiles manually drawn at different locations of the largest vessel.
Table 2: Data sets used in our experiments.

Data set  | Angle of aperture | Resolution  | Training set | Test set
DRIVE     | 45°               | 565 × 584   | 20 images    | Not used
STARE     | 35°               | 605 × 700   | 10 images    | Not used
ARIA      | 50°               | 768 × 576   | 55 images    | Not used
CHASEDB1  | 30°               | 999 × 960   | 8 images     | Not used
HRF       | 60°               | 3504 × 2336 | 15 images    | 30 images
Horizontal diameter of the optic disc (in pixels): by measuring the length of a line horizontally drawn
from the left to the right edge of the optic disc.
Width of the FOV (in pixels): obtained automatically from the FOV binary masks.
Angular resolution: taken as the ratio between the width of the FOV and the angle of aperture of the
fundus camera.20
To analyze which of these structural measurements are most suitable for estimating each parameter, different lines are fitted to the data, and the coefficient of determination R² of each linear regression model is used as an indicator of the overall model quality.26 The structural measurement that results in the highest R² value is taken as the optimal metric for a given model. Subsequently, this measurement is manually taken from any new image, and the feature parameters are fixed according to the estimation provided by the corresponding model. Finally, F-tests were also performed to evaluate whether a linear regression model is suitable for performing the parameter estimation or not.26
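The R²-based selection of the best structural measurement can be sketched as follows; the measurement values below are hypothetical, chosen only so that one candidate is clearly more linear than the other:

```python
import numpy as np

def r_squared(x, y):
    """Coefficient of determination of a least-squares line fit to (x, y)."""
    slope, intercept = np.polyfit(x, y, deg=1)
    residuals = y - (slope * x + intercept)
    ss_res = np.sum(residuals ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

# Hypothetical optimal values of one feature parameter across four
# low resolution data sets, plus two candidate structural measurements.
optimal_param = np.array([9.0, 12.0, 10.0, 19.0])
candidates = {
    "optic_disc_diameter": np.array([110.0, 150.0, 125.0, 240.0]),
    "fov_width": np.array([540.0, 620.0, 560.0, 990.0]),
}

# Keep the measurement whose linear fit explains the parameter best.
best = max(candidates, key=lambda k: r_squared(candidates[k], optimal_param))
```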
3. EXPERIMENTAL SETUP

3.1 Materials
As previously indicated in Section 2, our approach requires two different training stages. First, low resolution data sets are used to optimize feature parameters and to compute the corresponding estimators. Once these models are learned, a second training set, with approximately the same resolution as the test set, is needed to learn the fully connected CRF model.

To train our parameter estimators, we used the training sets of DRIVE16 and CHASEDB1,27 and two additional sets sampled from STARE17 and ARIA.28,29 Afterwards, the HRF18 data set was used for validating the complete pipeline. Table 2 summarizes the main characteristics of each database.
DRIVE16 comprises 40 color fundus photographs (7 with pathologies), obtained from a diabetic retinopathy screening program in the Netherlands. The set was originally divided into a training and a test set, each of them containing 20 images; we only used the training set in our experiments. STARE17 contains 20 fundus images (10 of them with pathologies) commonly used to evaluate vessel segmentation algorithms. As the set is not divided into training and test sets, we used the first 10 images to train the linear regression models. ARIA28,29 is made up of three different groups of fundus images: 23 taken from patients with age-related macular degeneration, 59 from patients with diabetes and 61 from healthy subjects. We built a training set from ARIA by extracting the first 8, 23 and 24 images from each subset, respectively. Finally, CHASEDB127 contains 28 fundus images of children, centered on the optic disc. This set is divided into a training and a test set, containing 20 and 8 fundus photographs, respectively. We used these last 8 images for training our parameter estimators.
HRF18 was used to validate our segmentation approach. It comprises 45 images: 15 of healthy subjects, 15 of patients with diabetic retinopathy and 15 of glaucomatous patients. As this data set was used for evaluating the full segmentation approach, we divided it as in our previous work19 into a training and a test set. The training set is made up of the first 5 images of each subset, while the remaining 30 images were used for testing. Moreover, the training set was randomly split into a training* set and a validation set, with the first one (10 images) used for learning the CRF model and the second one (5 images) for validating the regularization parameter C.
3.2 Evaluation metrics
Several metrics are used in the literature for evaluating blood vessel segmentation algorithms. In general, most
of them are expressed in terms of sensitivity (Se, also known as recall, Re), specificity (Sp) and precision (Pr), which are obtained as follows:

Se = Re = TP / (TP + FN)    (6)

Sp = TN / (TN + FP)    (7)

Pr = TP / (TP + FP)    (8)
Sensitivity quantifies the ability of the segmentation method to identify the vasculature, while precision measures how well the method is able to differentiate it from other structures of the fundus. Similarly, specificity determines the capability of the method to properly distinguish non-vascular structures.
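These rates follow directly from the pixel-level confusion counts; the counts below are illustrative:

```python
def basic_rates(tp, tn, fp, fn):
    """Sensitivity (recall), specificity and precision, per Eqs. (6)-(8)."""
    se = tp / (tp + fn)   # fraction of vessel pixels correctly detected
    sp = tn / (tn + fp)   # fraction of background pixels correctly rejected
    pr = tp / (tp + fp)   # fraction of detections that are actual vessels
    return se, sp, pr

se, sp, pr = basic_rates(tp=80, tn=900, fp=20, fn=20)
```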
As previously mentioned in Section 2.2, a metric Q is needed to guide the feature optimization procedure. All our experiments were performed using the area under the precision/recall curve to quantify feature performance. We chose this evaluation metric as it is appropriate for characterizing features in imbalanced problems, where the proportion of positive samples is smaller than the proportion of negative ones.24,25

To estimate the ability of our method to segment the vessels, we used Se, Sp and Pr. Moreover, we included other global metrics such as the Matthews correlation coefficient, the F1-score and the G-mean, which are also robust under class imbalance.19
The Matthews correlation coefficient (MCC)22 compares manual and automated segmentations, and is given by:

MCC = (TP/N − S × P) / sqrt(P × S × (1 − S) × (1 − P))    (9)

where N = TP + TN + FP + FN is the total number of pixels in the image, S = (TP + FN)/N and P = (TP + FP)/N. It takes values between −1 and +1, where +1 indicates a perfect prediction, 0 a random prediction and −1 a segmentation that is exactly the opposite of the true one.
The F1-score19 is defined as the harmonic mean of Pr and Re:

F1-score = 2 × Pr × Re / (Pr + Re)    (10)

Its maximum value, 1, corresponds to a perfect segmentation, while its minimum value, 0, corresponds to a completely wrong detection. This metric is equivalent to the Dice coefficient, which is also widely used for evaluating segmentation methods.
Finally, the G-mean has a similar behavior to the F1-score, although it is obtained as the geometric mean of Se and Sp:

G-mean = sqrt(Se × Sp)    (11)
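The three global metrics of Eqs. (9)-(11) can be computed together from the confusion counts:

```python
import math

def global_metrics(tp, tn, fp, fn):
    """F1-score, G-mean and MCC from pixel counts, per Eqs. (9)-(11)."""
    n = tp + tn + fp + fn
    s = (tp + fn) / n                 # proportion of true vessel pixels
    p = (tp + fp) / n                 # proportion of predicted vessel pixels
    mcc = (tp / n - s * p) / math.sqrt(p * s * (1 - s) * (1 - p))
    pr = tp / (tp + fp)
    re = tp / (tp + fn)
    f1 = 2 * pr * re / (pr + re)      # harmonic mean of precision and recall
    sp = tn / (tn + fp)
    gmean = math.sqrt(re * sp)        # geometric mean of Se and Sp
    return f1, gmean, mcc
```

Note that Eq. (9) is algebraically equivalent to the more familiar form MCC = (TP·TN − FP·FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)).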
4. RESULTS

4.1 Parameter estimation
Table 3 presents the R² values obtained using each structural measurement for fitting the parameter estimation models. For the Soares et al. feature we computed three different estimators, one for each of the three scales a = {a1, a2, a3}. These values grow almost linearly with the image resolution, so the resulting models have high R² values. On the contrary, predicting θ_p using linear regression appeared to be unfeasible, as the obtained R²
Table 3: R2values obtained for each combination of feature parameter and structural measurements. p-values
of the F-tests performed for each learned model are also included.
Parameter | l | a1 | a2 | a3 | W | θ_p
Vessel calibre 0.941
Optic disc diameter 0.990
FOV diameter 0.971
Angular resolution 0.973
values are close to 0. This means that the simplest possible model, the average of the samples used for learning the line, performs much better than the estimated model.

When analyzing each structural measurement separately, it is possible to see that the diameter of the optic disc allows obtaining the best estimations of W and l, while the calibre of the largest vessel is the best predictor for a. These results are complementary to those reported by Vostatek et al., who only analyzed the number of pixels in the ground truth labeling and the angular resolution. The p-values reported by the F-tests performed for each learned model also support the idea that using lines to estimate feature parameters is valuable, although not for θ_p.
Figure 3 illustrates the best parameter estimators for each feature and for θ_p. Optimal values obtained by adjusting each parameter on the HRF training set are also included for comparison purposes, although they were not used to fit the models. It is possible to see that l, a and W grow linearly with their corresponding structural measurements, which justifies the usage of linear regression for fitting their values. On the contrary, the optimal θ_p values of the low resolution data sets do not change linearly, which explains why they cannot be properly approximated with linear regression.
4.2 Segmentation results
Segmentation results using our approach with automated feature parameter estimation are given in Table 4. As mentioned in Section 4.1, estimating θ_p using a linear regression model is not feasible due to its non-linear behavior with respect to the selected structural measurements. Hence, θ_p was fixed to the optimal value obtained on the validation set sampled from the HRF training set. We also included other works in the literature that used the same evaluation protocol and/or training and test splits. Vostatek et al.20 reported the performance obtained by evaluating the supervised method by Soares et al.10 on HRF. Such an approach is based on learning a Gaussian mixture model classifier from the responses to the 2D Gabor wavelets. Vostatek et al. trained this method using a random sample of 15 images taken from HRF. An exact comparison is unfeasible, as we have no certainty that the images used for testing are exactly the same as those used to evaluate our model; however, we include these results to provide a general idea of the contribution of the fully connected CRF model with respect to using the original classifier. A series of Wilcoxon signed-rank hypothesis tests were performed to compare the results obtained by our method with respect to our previous approach19 and to those obtained by Odstrčilík et al.18
As seen in Table 4, our approach consistently performs better than our previous proposal based on scaling parameters using a compensation factor. The improvements obtained by adapting the feature parameters with our strategy and by selecting an optimal θ_p value, as measured by all the considered quality metrics, are statistically significant. In particular, this strategy is able to achieve consistently higher F1-score (p < 9.2 × 10⁻⁷), G-mean (p < 9.2 × 10⁻⁷) and MCC (p < 9.2 × 10⁻⁷) values, which corresponds to a general improvement in the quality of the results. When decomposing these metrics in terms of their individual measurements, it is possible to see that Se is significantly improved (p < 9.2 × 10⁻⁷) by the estimation of the features, which is related to a better ability to detect thin structures (Figure 4) and to overcome the issues of the original approach in dealing with the bright central reflex in arteries (Figure 5). Moreover, larger Sp (p < 0.0044) and Pr (p < 6 × 10⁻⁶) values indicate a reduction in the number of false positive detections.
Figure 3: Best parameter estimators for each feature parameter: (a) Zana and Klein feature, (b) Soares et al. feature, (c) Nguyen et al. feature, (d) θ_p parameter. Optimal values on HRF are included only for comparison purposes, but were not used to fit the linear regression model.
Compared to other existing methods, it is worth noting that our approach achieved the highest average F1-score (p < 4.1 × 10⁻⁶) and MCC (p < 1.9 × 10⁻⁶) values. A higher average G-mean was obtained by the baseline method of Odstrčilík et al.,18 yet the difference is not statistically significant (p = 0.23). Such an approach also reported a higher average Se value than our method, but again the difference is not statistically significant (p = 0.09). Furthermore, it is important to underline that the method by Odstrčilík et al. is based on matched filter responses that are recovered from filters calibrated for this specific data set. In our case, we used an automated parameter estimation approach that does not require such an intensive calibration. Moreover, our method achieves higher Sp (p < 0.006) and Pr (p < 1.3 × 10⁻⁵) values, which correspond to a reduction in the number of false positive detections.
5. CONCLUSIONS

In this paper we have presented an ensemble approach for blood vessel segmentation in high resolution images, based on automatically estimating feature parameters. In particular, we have integrated a novel strategy for parameter estimation using linear regression with a fully connected CRF model, which is known to achieve better
Table 4: Results obtained on HRF.

Methods                             | Se     | Sp     | Pr     | F1     | G-mean | MCC
Odstrčilík et al., 2013 [18]        | 0.7772 | 0.9652 | 0.6950 | 0.7316 | 0.8657 | 0.7065
Vostatek et al., 2017 (Soares) [20] | 0.7340 | 0.9800 | -      | -      | 0.8481 | -
Vostatek et al., 2017 (Sofka) [20]  | 0.5830 | 0.9780 | -      | -      | 0.7550 | -
Orlando et al., 2017 [19]           | 0.7201 | 0.9713 | 0.7199 | 0.7168 | 0.8361 | 0.6900
Our approach                        | 0.7669 | 0.9725 | 0.7407 | 0.7503 | 0.8636 | 0.7267
(a) (b)
(c) (d)
Figure 4: Detection of thin vessels, as seen on results obtained on image 10 h from HRF. (a, c) Results obtained
using the ρmultiplier. (b, d) Results obtained using our approach.
results than other existing approaches. We have experimentally analyzed different structural measurements of
the images and their potential usage as guidelines to automatically fit a regression line. Our results indicated
that the optic disc diameter is suitable to estimate the parameters of the line detectors and the feature based
on morphology by reconstruction, while the calibre of the major vessel is the best structural measurement to
estimate the scales of the 2D Gabor filter. On the contrary, the experiments made to automatically adjust θp
indicated that this parameter does not scale linearly with respect to the resolution of the images (Figure 3(d)).
When analyzing the optimal θp values for each individual data set, we can see that the largest values were assigned to STARE and ARIA, while the smallest corresponded to DRIVE and CHASEDB1. STARE and ARIA are characterized by serious pathological cases in which large hemorrhages or exudates occur. By contrast, images in DRIVE and CHASEDB1 correspond mostly to healthy patients. HRF also contains pathological images, although its lesions are smaller. This might indicate that larger θp values are more suitable for segmenting images of patients with large pathologies.
When evaluating the segmentation method quantitatively, we observed that integrating the parameter estimation approach improved all the evaluation metrics with respect to using the original scaling factor. Furthermore, the comparison with other works showed that our approach performed consistently better than existing approaches evaluated using a similar training and test split.
In conclusion, the estimation strategy applied in this context makes it possible to obtain better results in terms of overall quality measurements, with a consistent improvement in the detection of the thinner vessels and a more appropriate behavior in the presence of bright central reflex. This approach can be exploited not only for adjusting the parameters of hand crafted features but also to calibrate deep learning based methods, for instance, which are usually trained using patches whose size depends on the image resolution.30

Figure 5: Improved segmentation of arteries with bright central reflex, as seen on image 12_h from HRF. (a, c, e) Results obtained using the ρ multiplier. (b, d, f) Results obtained using our approach.
Segmentation masks and further implementation details are provided in
This work is partially funded by a NVIDIA Hardware Grant and ANPCyT PICT 2014-1730, PICT 2016-0116
and PICT start-up 2015-0006. J.I.O. is funded by a doctoral scholarship granted by CONICET. We would also
like to thank Odstrčilík et al. for providing us with their segmentations.
[1] Fraz, M. M. et al., “Blood vessel segmentation methodologies in retinal images–a survey,” Computer Methods
and Programs in Biomedicine 108(1), 407–433 (2012).
[2] Mookiah, M. R. K., Acharya, U. R., Chua, C. K., Lim, C. M., Ng, E., and Laude, A., “Computer-aided
diagnosis of diabetic retinopathy: A review,” Computers in Biology and Medicine 43(12), 2136–2155 (2013).
[3] Orlando, J. I., Prokofyeva, E., del Fresno, M., and Blaschko, M., “Convolutional neural network transfer for
automated glaucoma identification,” in [12th International Symposium on Medical Information Processing
and Analysis], 101600U–101600U, International Society for Optics and Photonics (2017).
[4] Prokofyeva, E. and Zrenner, E., “Epidemiology of major eye diseases leading to blindness in Europe: A
literature review,” Ophthalmic Research 47(4), 171–188 (2012).
[5] Abràmoff, M. D. et al., “Retinal imaging and image analysis,” IEEE Reviews in Biomedical Engineering 3,
169–208 (2010).
[6] Mendonça, A. M., Sousa, A., Mendonça, L., and Campilho, A., “Automatic localization of the optic disc by
combining vascular and intensity information,” Computerized Medical Imaging and Graphics 37(5), 409–417
[7] Gupta, G., Ram, K., Kulasekaran, S., Joshi, N., Sivaprakasam, M., and Gandhi, R., “Detection of retinal
hemorrhages in the presence of blood vessels,” in [Proceedings of the Ophthalmic Medical Image Analysis
First International Workshop, OMIA 2014, Held in Conjunction with MICCAI 2014], Chen, X., Garvin, M. K., and Liu, J., eds., 1, 105–112, Iowa Research Online (2014).
[8] Orlando, J. I., Prokofyeva, E., del Fresno, M., and Blaschko, M. B., “Learning to detect red lesions in fundus
photographs: An ensemble approach based on deep learning,” arXiv preprint arXiv:1706.03008 (2017).
[9] Zheng, Y., Daniel, E., Hunter, A. A., Xiao, R., Gao, J., Li, H., Maguire, M. G., Brainard, D. H., and Gee,
J. C., “Landmark matching based retinal image alignment by enforcing sparsity in correspondence matrix,”
Medical Image Analysis 18(6), 903–913 (2014).
[10] Soares, J. V. et al., “Retinal vessel segmentation using the 2-D Gabor wavelet and supervised classification,”
IEEE Transactions on Medical Imaging 25(9) (2006).
[11] Nguyen, U. T. et al., “An effective retinal blood vessel segmentation method using multi-scale line detection,”
Pattern Recognition 46(3), 703–715 (2013).
[12] Zana, F. and Klein, J.-C., “Segmentation of vessel-like patterns using mathematical morphology and cur-
vature evaluation,” IEEE Transactions on Image Processing 10(7), 1010–1019 (2001).
[13] Orlando, J. I. and del Fresno, M., “Reviewing preprocessing and feature extraction techniques for retinal
blood vessel segmentation in fundus images,” Mecánica Computacional XXXIII(42), 2729–2743 (2014).
[14] Sofka, M. and Stewart, C. V., “Retinal vessel centerline extraction using multiscale matched filters, confidence and edge measures,” IEEE Transactions on Medical Imaging 25(12), 1531–1546 (2006).
[15] Orlando, J. I. and Blaschko, M., “Learning fully-connected CRFs for blood vessel segmentation in retinal
images,” in [Medical Imaging Computing and Computer Assisted Intervention], Golland, P., Barillot, C.,
Hornegger, J., and Howe, R., eds., 8149, 634–641, Springer (2014).
[16] Niemeijer, M. et al., “Comparative study of retinal vessel segmentation methods on a new publicly available
database,” in [Medical Imaging 2004], 648–656, International Society for Optics and Photonics (2004).
[17] Hoover, A. et al., “Locating blood vessels in retinal images by piecewise threshold probing of a matched
filter response,” IEEE Transactions on Medical Imaging 19(3), 203–210 (2000).
[18] Odstrčilík, J., Kolar, R., Budai, A., Hornegger, J., Jan, J., Gazarek, J., Kubena, T., Cernosek, P., Svoboda,
O., and Angelopoulou, E., “Retinal vessel segmentation by improved matched filtering: evaluation on a new
high-resolution fundus image database,” IET Image Processing 7(4), 373–383 (2013).
[19] Orlando, J. I., Prokofyeva, E., and Blaschko, M. B., “A discriminatively trained fully connected conditional
random field model for blood vessel segmentation in fundus images,” IEEE Transactions on Biomedical
Engineering 64(1), 16–27 (2017).
[20] Vostatek, P., Claridge, E., Uusitalo, H., Hauta-Kasari, M., Fält, P., and Lensu, L., “Performance comparison of publicly available retinal blood vessel segmentation methods,” Computerized Medical Imaging and Graphics 55, 2–12 (2017).
[21] Marín, D., Aquino, A., Gegúndez-Arias, M. E., and Bravo, J. M., “A new supervised method for blood
vessel segmentation in retinal images by using gray-level and moment invariants-based features,” IEEE
Transactions on Medical Imaging 30(1), 146–158 (2011).
[22] Azzopardi, G., Strisciuglio, N., Vento, M., and Petkov, N., “Trainable COSFIRE filters for vessel delineation with application to retinal images,” Medical Image Analysis 19(1), 46–57 (2015).
[23] Krähenbühl, P. and Koltun, V., “Efficient inference in fully connected CRFs with Gaussian edge potentials,”
in [Advances in Neural Information Processing Systems], 109–117 (2012).
[24] Saito, T. and Rehmsmeier, M., “The precision-recall plot is more informative than the ROC plot when evaluating binary classifiers on imbalanced datasets,” PLoS ONE 10(3), e0118432 (2015).
[25] Lo Vercio, L., Orlando, J. I., del Fresno, M., and Larrabide, I., “Assessment of image features for vessel wall
segmentation in intravascular ultrasound images,” International Journal of Computer Assisted Radiology
and Surgery 11(8), 1397–1407 (2016).
[26] Lomax, R. G. and Hahs-Vaughn, D. L., [Statistical concepts: A second course], Routledge (2013).
[27] Fraz, M. M., Remagnino, P., Hoppe, A., Uyyanonvara, B., Rudnicka, A. R., Owen, C. G., and Barman,
S. A., “An ensemble classification-based approach applied to retinal blood vessel segmentation,” IEEE
Transactions on Biomedical Engineering 59(9), 2538–2548 (2012).
[28] Zheng, Y., Hijazi, M. H. A., and Coenen, F., “Automated disease/no disease grading of age-related macular
degeneration by an image mining approach,” Investigative Ophthalmology & Visual Science 53(13), 8310–
8318 (2012).
[29] Fumero, F., Sigut, J., Alayón, S., González-Hernández, M., and González, M., “Interactive tool and database
for optic disc and cup segmentation of stereo and monocular retinal fundus images,” in [Short Papers
Proceedings–WSCG 2015], 91–97 (2015).
[30] Liskowski, P. and Krawiec, K., “Segmenting retinal blood vessels with deep neural networks,” IEEE Trans-
actions on Medical Imaging 35(11), 2369–2380 (2016).