Automated segmentation by pixel
classification of retinal layers in
ophthalmic OCT images
K. A. Vermeer,1,* J. van der Schoot,1,2 H. G. Lemij,2 and J. F. de Boer1,3,4
1Rotterdam Ophthalmic Institute, Rotterdam Eye Hospital, P.O. Box 70030, 3000 LM
Rotterdam, The Netherlands
2Glaucoma Service, Rotterdam Eye Hospital, P.O. Box 70030, 3000 LM Rotterdam, The
Netherlands
3Dept. of Physics and Astronomy, VU University, De Boelelaan 1081, 1081 HV Amsterdam,
The Netherlands
4LaserLaB Amsterdam, VU University, De Boelelaan 1081, 1081 HV Amsterdam, The
Netherlands
*k.vermeer@eyehospital.nl
Abstract: Current OCT devices provide three-dimensional (3D) in-vivo
images of the human retina. The resulting very large data sets are difficult
to manually assess. Automated segmentation is required to automatically
process the data and produce images that are clinically useful and easy to
interpret. In this paper, we present a method to segment the retinal layers
in these images. Instead of using complex heuristics to define each layer,
simple features are defined and machine learning classifiers are trained
based on manually labeled examples. When applied to new data, these clas-
sifiers produce labels for every pixel. After regularization of the 3D labeled
volume to produce a surface, this results in consistent, three-dimensionally
segmented layers that match known retinal morphology. Six labels were
defined, corresponding to the following layers: Vitreous, retinal nerve fiber
layer (RNFL), ganglion cell layer & inner plexiform layer, inner nuclear
layer & outer plexiform layer, photoreceptors & retinal pigment epithelium
and choroid. For both normal and glaucomatous eyes that were imaged
with a Spectralis (Heidelberg Engineering) OCT system, the five resulting
interfaces were compared between automatic and manual segmentation.
RMS errors for the top and bottom of the retina were between 4 and 6 µm,
while the errors for intra-retinal interfaces were between 6 and 15 µm.
The resulting total retinal thickness maps corresponded with known retinal
morphology. RNFL thickness maps were compared to GDx (Carl Zeiss
Meditec) thickness maps. Both maps were mostly consistent but local
defects were better visualized in OCT-derived thickness maps.
© 2011 Optical Society of America
OCIS codes: (100.0100) Image processing; (100.2960) Image analysis; (100.5010) Pattern recognition; (170.4470) Ophthalmology; (170.4500) Optical coherence tomography; (170.4580) Optical diagnostics for medicine.
References and links
1. N. Nassif, B. Cense, B. Park, M. Pierce, S. Yun, B. Bouma, G. Tearney, T. Chen, and J. de Boer, “In vivo high-
resolution video-rate spectral-domain optical coherence tomography of the human retina and optic nerve,” Opt.
Express 12, 367–376 (2004).
2. T. Fabritius, S. Makita, M. Miura, R. Myllyla, and Y. Yasuno, “Automated segmentation of the macula by optical
coherence tomography,” Opt. Express 17, 15659–15669 (2009).
3. A. Mishra, A. Wong, K. Bizheva, and D. A. Clausi, “Intra-retinal layer segmentation in optical coherence tomog-
raphy images,” Opt. Express 17, 23719–23728 (2009).
4. V. Kajić, B. Povazay, B. Hermann, B. Hofer, D. Marshall, P. L. Rosin, and W. Drexler, “Robust segmentation of
intraretinal layers in the normal human fovea using a novel statistical model based on texture and shape analysis,”
Opt. Express 18, 14730–14744 (2010).
5. M. K. Garvin, M. D. Abramoff, X. Wu, S. R. Russell, T. L. Burns, and M. Sonka, “Automated 3-D intrareti-
nal layer segmentation of macular spectral-domain optical coherence tomography images,” IEEE Trans. Med.
Imaging 28, 1436–1447 (2009).
6. S. J. Chiu, X. T. Li, P. Nicholas, C. A. Toth, J. A. Izatt, and S. Farsiu, “Automatic segmentation of seven retinal
layers in SDOCT images congruent with expert manual segmentation,” Opt. Express 18, 19413–19428 (2010).
7. M. Mujat, R. Chan, B. Cense, B. Park, C. Joo, T. Akkin, T. Chen, and J. de Boer, “Retinal nerve fiber layer
thickness map determined from optical coherence tomography images,” Opt. Express 13, 9480–9491 (2005).
8. Q. Yang, C. A. Reisman, Z. Wang, Y. Fukuma, M. Hangai, N. Yoshimura, A. Tomidokoro, M. Araie, A. S. Raza,
D. C. Hood, and K. Chan, “Automated layer segmentation of macular OCT images using dual-scale gradient
information,” Opt. Express 18, 21293–21307 (2010).
9. H. Zhu, D. P. Crabb, P. G. Schlottmann, T. Ho, and D. F. Garway-Heath, “FloatingCanvas: quantification of 3D
retinal structures from spectral-domain optical coherence tomography,” Opt. Express 18, 24595–24610 (2010).
10. R. J. Zawadzki, A. R. Fuller, D. F. Wiley, B. Hamann, S. S. Choi, and J. S. Werner, “Adaptation of a support
vector machine algorithm for segmentation and visualization of retinal structures in volumetric optical coherence
tomography data sets,” J. Biomed. Opt. 12, 041206 (2007).
11. P. Viola and M. Jones, “Rapid object detection using a boosted cascade of simple features,” IEEE CVPR 1,
511–518 (2001).
12. V. N. Vapnik, The Nature of Statistical Learning Theory (Springer-Verlag, 1995).
13. V. N. Vapnik, Estimation of Dependences Based on Empirical Data (Springer-Verlag, 1982).
14. C. Cortes and V. N. Vapnik, “Support-vector networks,” Mach. Learn. 20, 273–297 (1995).
15. M. Aizerman, E. Braverman, and L. Rozonoer, “Theoretical foundations of the potential function method in
pattern recognition learning,” Autom. Rem. Control 25, 821–837 (1964).
16. Y.-W. Chang, C.-J. Hsieh, K.-W. Chang, M. Ringgaard, and C.-J. Lin, “Training and testing low-degree polyno-
mial data mappings via linear SVM,” J. Mach. Learn. Res. 11, 1471–1490 (2010).
17. M. Kass, A. Witkin, and D. Terzopoulos, “Snakes: Active contour models,” Int. J. Comput. Vision 1, 321–331
(1987).
18. S. Osher and N. Paragios, Geometric Level Set Methods in Imaging, Vision, and Graphics (Springer-Verlag,
2003).
19. B. Cense, T. C. Chen, B. H. Park, M. C. Pierce, and J. F. de Boer, “Thickness and birefringence of healthy retinal
nerve fiber layer tissue measured with polarization-sensitive optical coherence tomography,” Invest. Ophthalmol.
Vis. Sci. 45, 2606–2612 (2004).
20. X.-R. Huang, H. Bagga, D. S. Greenfield, and R. W. Knighton, “Variation of peripapillary retinal nerve fiber
layer birefringence in normal human subjects,” Invest. Ophthalmol. Vis. Sci. 45, 3073–3080 (2004).
21. H. Ishikawa, J. Kim, T. R. Friberg, G. Wollstein, L. Kagemann, M. L. Gabriele, K. A. Townsend, K. R. Sung, J. S.
Duker, J. G. Fujimoto, and J. S. Schuman, “Three-dimensional optical coherence tomography (3D-OCT) image
enhancement with segmentation-free contour modeling C-mode,” Invest. Ophthalmol. Vis. Sci. 50, 1344–1349
(2009).
22. S. Jiao, R. Knighton, X. Huang, G. Gregori, and C. Puliafito, “Simultaneous acquisition of sectional and fundus
ophthalmic images with spectral-domain optical coherence tomography,” Opt. Express 13, 444–452 (2005).
23. E. C. Lee, J. F. de Boer, M. Mujat, H. Lim, and S. H. Yun, “In vivo optical frequency domain imaging of human
retina and choroid,” Opt. Express 14, 4403–4411 (2006).
1. Introduction
Current spectral domain Optical Coherence Tomography (OCT), as implemented by various
manufacturers for ophthalmic applications, produces high quality data at a high speed. Typi-
cally, around 15,000–40,000 A-lines (depth scans at a single location) are produced per second.
This high speed enables the acquisition of three-dimensional or volumetric data sets in a short
period of time [1]. Combined with some method for motion correction, densely sampled vol-
umes may be produced with increased signal-to-noise ratio and reduced speckle due to averag-
ing, while the acquisition time (including alignment and focusing) is just a few minutes, even
in pathological eyes.
Densely sampling a volume results in large data sets (on the order of 50 million pixels) which
are difficult to quickly analyze in a clinical setting. Adequately representing three-dimensional
data on a display is challenging and the available time for review is very limited. While the
data contains a large amount of information, often much of it is irrelevant for the specific task
at hand, such as glaucoma detection or monitoring. Reducing the data to a much smaller and
easier to interpret set, still containing most relevant information, is therefore vital for routine
clinical use. In addition, reducing the data is required for most tasks related to computer-aided
diagnosis.
In most cases, segmentation of the data is a prerequisite for performing the mentioned data
reduction. In the case of retinal OCT images, this means that the layers of the retina have to
be labeled automatically. Several methods for segmentation of OCT data have been described
previously and are briefly reviewed below.
Fabritius et al. proposed a straight-forward intensity based method to find only the inner
limiting membrane and the retinal pigment epithelium (RPE), providing a three-dimensional
segmentation of the retina [2]. Mishra et al. employed a two-step method based on a kernel-
based optimization scheme to segment B-scans of rodent retinas [3]. Kajić et al. introduced
a modification of an active appearance model for segmenting macular OCT data, which was
trained on a large number of manually labeled example images [4]. Garvin et al. introduced
graph cuts to localize the boundary between layers, which are guaranteed to find the global
optimum [5]. Weights were defined heuristically and shape models were derived from train-
ing data. Instead of graph cuts, Chiu et al. used dynamic programming techniques as a more
efficient technique to localize various retinal layer interfaces in single B-scans [6].
A significant portion of these methods employs a segmentation per two-dimensional OCT frame [3, 6],
sometimes subsequently combined to a three-dimensional result by post-processing [4, 7, 8],
while others offer a true three-dimensional segmentation [2,5,9].
In this paper, we present a method for three-dimensional retinal layer segmentation in OCT
images by a flexible method that learns from provided examples. Parts of representative OCT
scans were manually segmented and used by the algorithms to learn from. Learning and classi-
fication of pixels was done by a support vector machine, although many other machine learning
classifiers may be employed alternatively. Smoothness of the detected interfaces was guaran-
teed by the level set regularization that was applied after the pixel classification. The procedure
was the same for all layers, except for the manually segmented data used to train the classifier.
Similar to Zawadzki et al. [10], a support vector machine is used to classify pixels in the
OCT image, but we aim to do this in a fully automated way. Like in other previous work [4,5],
our method uses example data (manually labeled scans) to learn from. However, we do not put
strong restrictions on the shape of the interfaces or layers, because the segmentation should also
be applicable to atypical (i.e., diseased) retinas, whose shape is not represented by the learning
data set.
Results on different interfaces between layers are presented and evaluated for both normal
and glaucomatous eyes. Validation was performed by comparing the automatically and manu-
ally segmented interfaces. Further processing of the algorithms’ outcome produced thickness
maps of single or combined layers, which can be used for clinical assessment of the pathology
of the imaged eye, and retinal and choroidal blood vessel images.
Fig. 1. Overview of the feature calculation and classification process. Every A-line is pro-
cessed to produce averages and gradients at different scales, thereby transforming every
pixel into a feature vector. The classifier calculates the label based on the feature vector,
resulting in a labeled A-line.
2. Methods
The full algorithm comprises three steps: defining features, classifying pixels and performing
regularization. Each of these steps is further explained in this section. In the learning
stage, the features are defined and the classifier is trained. No regularization is required in this
case. An overview of the first two steps is given in Fig. 1. Note that features are defined based
on individual A-lines, resulting in a feature vector for each pixel. These pixels are then individ-
ually classified and finally the labels in the whole volume are regularized to produce a smooth
interface.
When determining multiple interfaces, the described method is applied subsequently to each
interface of interest. On the retinal OCT scans, a hierarchical approach is taken: first, the top
and bottom of the retina are detected and then the intra-retinal interfaces are localized. These
intra-retinal interfaces are forced to lie within the retina. However, the order of intra-retinal
interfaces is not enforced.
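To make this concrete, the following Python sketch shows how a single pass of the pipeline could be organized. It is a structural illustration only: featurize, regularize, and the mask convention are our assumptions, not the authors' implementation.

    import numpy as np

    def segment_interface(volume, clf, featurize, regularize, mask=None):
        """One interface pass: features (Sec. 2.1), per-pixel classification
        (Sec. 2.2), 3D regularization (Sec. 2.3). All names are illustrative."""
        feats = featurize(volume)                           # feature vector per pixel
        labels = clf.predict(feats).reshape(volume.shape)   # +/-1 per pixel
        if mask is not None:
            # hierarchical constraint: outside the band between the retina's
            # top and bottom, overwrite labels so intra-retinal interfaces
            # are forced to lie within the retina
            labels = np.where(mask, labels, -1)
        return regularize(labels)                           # smooth 3D interface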
2.1. Features
Classification of pixels is generally done based on one or more features of these pixels. In OCT
data, the most basic feature is simply the value produced by the OCT measurement. However,
given that a backscatter value is not specific for any tissue, the data cannot be segmented based
on only that. For example, both the RNFL and the RPE are strongly backscattering layers in
the retina.
Our features are defined based on two observations. First, as explained above, incorporating
only the pixel value itself is insufficient. Instead, data from pixels above and below the current
one should be incorporated as well. Second, an interface is often delineated by an increase or
decrease of the OCT signal, resulting in an intensity edge in the B-scan. We chose to define
features based on individual A-lines. This enables the use of the same features (and therefore
classifiers) irrespective of the scan protocol (i.e., the number of A-lines per B-scan, or the
number of B-scans per volume). Following this reasoning, we used one dimensional Haar-
like features [11]. We incorporated averages and gradients, both on different scales. Haar-like
features were chosen over, for example, Gaussian averages and differences, because of their
fast implementation by means of lookup tables.

Fig. 2. Features calculated for each pixel on an A-line, at different scales. The A-line is
displayed in grayscale in the background of each graph. The features are defined by averages
(red lines) and gradients (green lines).
Let the intensity along an A-line be denoted by f_{x,y}(z), where x and y are the lateral
coordinates of the A-line and z is the depth or distance in the axial direction. In the remainder,
we will skip the lateral coordinates and simply write f(z). Then the first feature, g_0, is simply f itself:

g_0(z) = f(z). (1)

Next, the averages g_d at scale d are defined by simply averaging 2^d pixels centered on f:

g_d(z) = \frac{1}{2^d} \sum_{z'=1-2^{d-1}}^{2^{d-1}} f(z+z'). (2)

Similarly, the gradient h_0 is calculated by

h_0(z) = f(z+1) - f(z) = g_0(z+1) - g_0(z) (3)

and the gradients h_d at scale d are defined by

h_d(z) = g_d(z + 2^{d-1}) - g_d(z - 2^{d-1}). (4)

Based on these features, we define the full feature vector x(z) up to scale d for each pixel as

x(z) = [g_0(z), h_0(z), g_1(z), h_1(z), \ldots, g_d(z), h_d(z)]. (5)
An example of calculating these features for an A-line is shown in Fig. 2. Similarly, the result
for a full B-scan is shown in Fig. 3. In this figure and in the overview of Fig. 1, only 4 scales for
both averages and gradients are indicated. However, in our experiments, 8 scales of both types
were used (as in Fig. 2).
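As a concrete illustration, the following sketch computes the feature vector of Eq. (5) for a single A-line. Cumulative sums play the role of the lookup tables mentioned above; edge-padding at the ends of the A-line is our assumption, as the paper does not specify boundary handling.

    import numpy as np

    def haar_features(a_line, n_scales=8):
        """Averages g_d and gradients h_d of Eqs. (1)-(5) for every pixel."""
        f = np.asarray(a_line, dtype=float)
        n = f.size
        pad = 2 ** n_scales                        # margin for the largest window
        fp = np.pad(f, pad, mode='edge')           # boundary handling: assumption
        cs = np.concatenate(([0.0], np.cumsum(fp)))
        z = np.arange(n) + pad                     # pixel positions, padded coords

        def g(d, shift=0):
            # g_d(z + shift): average of 2^d pixels centered on z + shift,
            # computed in O(1) per pixel from the cumulative sum (lookup table)
            half = 2 ** (d - 1)
            return (cs[z + shift + half + 1] - cs[z + shift - half + 1]) / 2 ** d

        feats = [f.copy(),                         # g_0(z) = f(z), Eq. (1)
                 np.append(f[1:] - f[:-1], 0.0)]   # h_0(z) = f(z+1) - f(z), Eq. (3)
        for d in range(1, n_scales):
            half = 2 ** (d - 1)
            feats.append(g(d))                     # g_d(z), Eq. (2)
            feats.append(g(d, half) - g(d, -half)) # h_d(z), Eq. (4)
        return np.stack(feats, axis=1)             # shape (len(a_line), 2*n_scales)

With 8 scales of both types, every pixel is thus mapped to a 16-dimensional feature vector x(z).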
2.2. Classification
A classifier produces a label for each input or feature vector x. A specific type of classifier
is the support vector machine (SVM) [12]. During training, it aims at creating a maximum
margin between the classification boundary and the samples closest to this boundary [13,14].
When given a new, unlabeled feature vector x, the SVM evaluates
s(x) = \langle w, x \rangle + b (6)
and uses the sign of s(x) to produce the label. Here, ⟨·, ·⟩ denotes the inner product, w denotes
the normal of the (linear) classification boundary and b is some offset. In the training stage, w
is defined as a weighted sum of the training samples x_i. Due to the way SVMs are optimized,
many of the weights may go to zero and effectively only a relatively small number of samples,
the support vectors, will be used to define w. Equation (6) may now be rewritten as

s(x) = \sum_{i=1}^{N} \alpha_i y_i \langle x_i, x \rangle + b, (7)

where α_i denotes the weight of training sample x_i and y_i denotes its corresponding label (±1).
The classifier of Eq. (7) is a linear classifier, given that its result is a linear combination of the
inner products of the feature vector x and the support vectors. However, by replacing the inner
product by a kernel K(·, ·) [15], a non-linear SVM is constructed:

s(x) = \sum_{i=1}^{N} \alpha_i y_i K(x_i, x) + b. (8)
Implicitly, the kernel maps the input features into a possibly very high dimensional space.
In this feature space, a linear classification is performed. Various kernels may be used, such
as polynomial kernels or radial basis functions. In the latter case, the kernel maps the input
features into an infinite dimensional space, giving highly non-linear classification boundaries.
With polynomial kernels, the dimension of the feature space is better controlled.
In general, the kernel-SVM, given by Eq. (8), cannot be rewritten as an explicit linear func-
tion as in Eq. (6) or (7). The disadvantage of the implicit form of Eq. (8) is that it requires the
storage of all support vectors and, for every new sample, it needs to calculate the kernel for
each support vector.
In some cases, however, the kernel can be written explicitly. This applies, for example, to the
polynomial kernel K(x_i, x_j) = (x_i · x_j + 1)^d, where d is the degree of the polynomial. For higher
order polynomial kernels, the mapping results in a very high-dimensional feature space,
but for lower order kernels (degree 2 and possibly 3), explicit calculation is feasible. This is
done by writing the kernel as an inner product of a mapping φ(·):

K(x_i, x_j) = \langle \varphi(x_i), \varphi(x_j) \rangle. (9)

For example, for a polynomial kernel of degree 1, the corresponding mapping is simply
φ(x) = (1, x_1, \ldots, x_n)^T, where x_i is the i-th element of vector x, containing n elements.
[Fig. 3 panels: averages (top row) and gradients (bottom row) at scales 2^1, 2^3, 2^5 and 2^7; scale bar 1 mm]
Fig. 3. Graphical representation of the features for each pixel in a B-scan. The feature
vector for each pixel consists of one matching pixel from each of the processed B-scans.
In a similar way, explicit mappings may be found for polynomial kernels in general [16]. If such
an explicit mapping φ(·) exists, Eq. (8) may be rewritten as

s(x) = \sum_{i=1}^{N} \alpha_i y_i K(x_i, x) + b = \sum_{i=1}^{N} \alpha_i y_i \langle \varphi(x_i), \varphi(x) \rangle + b = \langle w, \varphi(x) \rangle + b, (10)

yielding a similar result as Eq. (6). As a result, w may now be precomputed and, for new data x,
only the mapping φ(x) and its inner product with w need to be calculated.
In our application, a polynomial kernel of degree 2 was chosen, with the corresponding mapping

\varphi(x) = \left(1, \sqrt{2} x_1, \ldots, \sqrt{2} x_n, x_1 x_2, \ldots, x_{n-1} x_n, x_1^2, \ldots, x_n^2\right), (11)

which transformed vector x from an n-dimensional space into an (n+2)(n+1)/2-dimensional
space. By precomputing w, storing all support vectors was no longer required. In addition,
calculation was much faster due to the linear operation of the resulting SVM.
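A sketch of this construction is given below; it is an illustration, not the authors' implementation. The √2 factors on the pairwise products are included so that ⟨φ(x_i), φ(x_j)⟩ reproduces (x_i · x_j + 1)^2 exactly; any fixed rescaling of individual mapped features is harmless in practice, since it is absorbed into w when the linear SVM is trained in the mapped space.

    import numpy as np
    from itertools import combinations

    def poly2_map(x):
        """Explicit map for the degree-2 polynomial kernel (x_i . x_j + 1)^2."""
        x = np.asarray(x, dtype=float)
        cross = np.array([x[i] * x[j]
                          for i, j in combinations(range(x.size), 2)])
        return np.concatenate(([1.0], np.sqrt(2.0) * x,
                               np.sqrt(2.0) * cross, x ** 2))

    def svm_score(x, w, b):
        """s(x) = <w, phi(x)> + b, as in Eq. (10); sign(s) gives the pixel label."""
        return float(np.dot(w, poly2_map(x))) + b

For the 16-dimensional feature vectors of Sec. 2.1, φ maps into an 18·17/2 = 153-dimensional space, so w is a single 153-element vector and no support vectors need to be stored.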
2.3. Regularization
The process of pixel classification leads to a volume of pixels with class labels. These labels
denote that, according to the classification procedure, the pixel is above or below the interface
of interest. The classification result may contain some errors, resulting in incorrectly assigned
labels. In addition, imaging artifacts may lead to misclassified A-lines and registration errors
result in discontinuities in the interface. Simply using every change in label as an interface
would result in a very unrealistic morphology of the layers. Instead, the detected interface will
first be regularized by applying some constraints. By penalizing the curvature of the interface,
its smoothness can be controlled.
One way of doing this is by using level set methods, which provide a non-parametric way to
describe the interface. In contrast with parametric methods, such as snakes [17] that provide an
explicit parametrization of the interface, level sets [18] embed the interface implicitly, which
has some computational advantages (i.e., regarding propagation and topology of the interface).
The level set function φ is defined in the same space as the input data (which is three-dimensional
for volumetric OCT data) and maps an input coordinate x to a scalar. The interface is
then defined as the curve C for which the level set is zero: C = {x | φ(x) = 0}.
The level set is evolved according to the general level set equation

\varphi_t = F |\nabla\varphi|. (12)

Here, φ_t is the update step of the level set, F is some force that drives the level set and ∇φ is the
gradient of the level set. Adding the smoothness constraint (based on the curvature κ, which
may be calculated directly from the level set function by κ = ∇ · (∇φ/|∇φ|)) and defining F by
the label field L(x) results in

\varphi_t = \alpha\kappa|\nabla\varphi| - \beta L(x)|\nabla\varphi|, (13)

where the ratio of α and β defines the relative contributions of both terms. The label field L is
produced by the classification routine explained in the previous section.
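The sketch below implements one explicit update step of Eq. (13) on a 2D grid; the paper applies the level set in 3D via ITK, and the values of α, β, the time step, and the iteration count here are placeholder assumptions.

    import numpy as np

    def level_set_step(phi, L, alpha=1.0, beta=2.0, dt=0.1):
        """phi_t = alpha*kappa*|grad phi| - beta*L(x)*|grad phi|, Eq. (13)."""
        dz, dx = np.gradient(phi)                 # axis 0 = depth, axis 1 = lateral
        mag = np.sqrt(dz ** 2 + dx ** 2) + 1e-12  # |grad phi|, regularized
        # curvature kappa = div(grad phi / |grad phi|)
        kappa = np.gradient(dz / mag, axis=0) + np.gradient(dx / mag, axis=1)
        return phi + dt * (alpha * kappa - beta * L) * mag

    # Usage sketch: initialize phi from the +/-1 label field L produced by
    # the classifier and iterate until the zero level set is smooth.
    # phi = L.astype(float)
    # for _ in range(200):
    #     phi = level_set_step(phi, L)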
2.4. Data
Ten healthy subjects and eight glaucomatous subjects were included from an ongoing study in the
Rotterdam Eye Hospital. Of each healthy subject, one eye was selected randomly and included
in this study. Subjects with diseased eyes were randomly selected and one moderately glaucomatous
eye was included per subject. Moderate glaucoma was defined by a visual field with a
mean defect of −12 to −6 dB, as tested on a Humphrey Field Analyzer (Carl Zeiss Meditec, Inc.,
Dublin, CA, USA) with a standard white-on-white, 24-2 SITA program.

Fig. 4. (a) SLO image of the retina indicating the position of a B-scan (red line) and the
total scan area (blue square). (b) OCT B-scan acquired along the red line of Fig. 4(a). (c)
Reconstructed en-face image based on 193 B-scans.
OCT scans of all eyes were acquired with a Spectralis OCT system (Heidelberg Engineering,
Dossenheim, Germany) that simultaneously captures a scanner laser ophthalmoscope (SLO)
image. The scan protocol acquired A-lines of 496 pixels. 512 A-lines were combined into
a B-scan and the full volumetric scan comprised 193 B-scans. The system employed an eye-
tracker and it was set to acquire and average 5 B-scans to improve the signal-to-noise ratio.
The field-of-view was 20°×20° (corresponding to an area of almost 6×6 mm). The resulting
volumetric data is rather anisotropic, with a sampling density of 89×33×259 mm⁻¹ (in the x, y, and
z-direction respectively, where x and y are lateral directions, and z is the axial direction). This
results in a pixel size of approximately 11×30×3.9 µm.
All analyses were carried out on the raw measurements as produced by the OCT device.
This means that the data was not logarithmically transformed (as is often done when displaying
OCT data) and no additional registration or alignment of the data was performed. (For display
purposes, the printed OCT images in this paper were transformed by taking the fourth root,
according to the manufacturer’s recommendation.)
An example of a B-scan with its corresponding location on the simultaneously acquired
fundus SLO image is shown in Figs. 4(a) and 4(b). Many of these B-scans are acquired within
the outlined area. In Fig. 4(c), the reconstructed en-face image of the OCT volume is shown,
indicating that the OCT data suffers from various artifacts, such as fluctuating brightness and
residual eye movement.
Two B-scans of each healthy subject and one B-scan of each glaucomatous subject were
manually segmented using ITK-SNAP (available at http://www.itksnap.org/). Every
pixel was assigned one of the following labels: Vitreous, RNFL, ganglion cell layer (GCL) &
inner plexiform layer (IPL), inner nuclear layer (INL) & outer plexiform layer (OPL), photore-
ceptors/RPE or choroid. In this way, five interfaces between these structures were defined. At
the location of the optic nerve head (ONH), no labeling was performed because in that area the
retina’s morphology differs from its usual layered arrangement. The ONH was thus excluded
from the training data. An example of a manually labeled B-scan, corresponding to the B-scan
in Fig. 4(b), is shown in Fig. 5 (left).
The vertical location of the manually segmented B-scans was chosen in the following way.
Per hemisphere, five vertical areas were defined: In the ONH, at the edge of the ONH and three
areas of increasing distance from the edge of the ONH. The training set contained two B-scans
in each of these ten areas, ensuring that the full sampled area was covered.
2.5. Application
One of the applications of OCT is the assessment of glaucoma. In this progressive eye disease,
the RNFL deteriorates and consequently its thickness decreases. Viewing and interpreting a full
3D OCT volume is too time-consuming for integrating in the normal clinical routine. Viewing
and interpreting a 2D thickness map, however, is much quicker. In addition, it may easily be
compared to normative data or previous measurements.
The RNFL thickness map was derived from the segmentation results by measuring the dis-
tance along each A-line from the top RNFL interface to the bottom RNFL interface. For each
A-line, a single distance measure results; these measures are then combined into a 2D map for the whole
3D data volume.
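A minimal sketch of this computation, assuming the axial sampling of 3.9 µm/pixel reported in Sec. 2.4 and illustrative array names:

    import numpy as np

    def rnfl_thickness_map(top_z, bottom_z, pixel_size_z_um=3.9):
        """RNFL thickness per A-line, in micrometers. top_z and bottom_z are
        2D arrays holding, per A-line, the depth index of the vitreous-RNFL
        and RNFL-GCL/IPL interfaces (hypothetical names)."""
        return (np.asarray(bottom_z) - np.asarray(top_z)) * pixel_size_z_um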
RNFL thickness maps were also acquired by the GDx (Carl Zeiss Meditec, Inc, USA), a
Scanning Laser Polarimetry (SLP) device. SLP measures the birefringence of the sampled tis-
sue. Since most of the birefringence is attributed to the retinal nerve fiber layer (RNFL), this
produces a measure that is related to the thickness of the RNFL. Assuming (incorrectly [19,20])
that the resulting retardation is proportional to the thickness of the tissue, an RNFL
thickness map is produced. The resulting SLP thickness maps were visually compared to the
OCT thickness maps produced by the presented segmentation method.
3. Results
Five interfaces were evaluated: between the vitreous and the RNFL, between the RNFL and the
GCL/IPL, between the IPL and the INL, between the OPL and the outer nuclear layer (ONL), and between the RPE
and the choroid. These interfaces defined six different areas, consisting of one or more tissue
layers.
Fig. 5. Manually (left) and automatically (right) labeled B-scan of a normal eye. This eye
was excluded from the training data when obtaining the automatic segmentation.
Fig. 6. Manually (left) and automatically (right) labeled B-scan of a glaucomatous eye. No
glaucomatous eye was included in the training data.
Table 1. Localization Errors of Automatically Segmented Interfaces Compared to Manual
Segmentations*

                     Root-mean-square error (µm)    Mean absolute deviation (µm)
Interface            Normal        Glaucoma         Normal        Glaucoma
Vitreous - RNFL      3.7           5.5              2.7           4.5
RNFL - GCL           15.4          11.5             9.1           8.3
IPL - INL            15.0          9.5              9.2           6.4
OPL - ONL            9.3           5.8              5.8           4.5
RPE - Choroid        5.5           6.2              4.2           4.7

*Error estimates for normal eyes are based on cross-validation.
Because the ONH was excluded from the training data, the results of the automatic segmen-
tation at the ONH’s position are invalid. In all numeric evaluations of the algorithm, the area of
the ONH was excluded. However, in the displayed figures, the papillary area is not masked out.
An example of the automatic segmentation result on a normal eye is shown in Fig. 5, next
to the corresponding manually labeled B-scan. Examples of a glaucomatous eye are shown in
Fig. 6.
3.1. Accuracy
For each interface, the accuracy was determined by comparing the results of the automatic al-
gorithm with the manually segmented data. Regularization in our method works in 3D, while
manual segmentation was done on a single B-scan only. Because B-scans were not realigned
(in depth or laterally) before analysis, misaligned B-scans did not match with the automati-
cally segmented interfaces due to the restrictions placed on the interfaces’ 3D shapes by the
regularization procedure. Manual segmentation, on the other hand, did not incorporate data of
adjacent B-scans. Alignment errors of B-scans within the full 3D scan could thus have resulted
in increased reported errors.
An overview of the results of the automatic segmentation is shown in Table 1. For evaluation
of the errors on the scans of normal subjects, a leave-one-out cross-validation was performed.
Considering all B-scans independently in the cross-validation procedure would introduce a bias,
due to the correlation of B-scans of the same eye. Instead, cross-validation was implemented
by repeatedly removing both scans of one eye from the training data set, evaluating the result
on those B-scans of the excluded eye and finally averaging the resulting error over all 10 rep-
etitions. Estimation of the error on the glaucomatous eyes was done by training the algorithm
on all normal eyes and using the glaucomatous eyes only for error assessment.
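A sketch of this grouping is shown below; train and evaluate are caller-supplied placeholders, as the paper does not give an implementation.

    import numpy as np

    def leave_one_eye_out(scans_by_eye, train, evaluate):
        """Hold out both B-scans of one eye at a time, so that correlated
        scans of the same eye never straddle the train/test split."""
        errors = []
        for eye, test_scans in scans_by_eye.items():
            train_scans = [s for other, scans in scans_by_eye.items()
                           if other != eye for s in scans]
            model = train(train_scans)
            errors.append(evaluate(model, test_scans))
        return float(np.mean(errors))   # averaged over the held-out eyes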
For comparison with other methods, two error measures were evaluated: the root-mean-
square error and the mean absolute deviation. For our data, an error of 3.9 µm corresponded
to an axial error of one pixel.
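Both measures are straightforward to compute from the per-A-line interface depths; a sketch, again assuming 3.9 µm/pixel axial sampling and with ONH A-lines excluded beforehand:

    import numpy as np

    def interface_errors(auto_z, manual_z, pixel_size_z_um=3.9):
        """Root-mean-square error and mean absolute deviation (in um) between
        automatic and manual interface depths, as reported in Table 1."""
        d = (np.asarray(auto_z, float) - np.asarray(manual_z, float)) * pixel_size_z_um
        return np.sqrt(np.mean(d ** 2)), np.mean(np.abs(d))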
3.2. Processing time
The processing times for classification (including calculation of the features) and regulariza-
tion were recorded on an Intel® Core™2 CPU (Q6600), running at 2.4 GHz and containing
4 GB of memory. While this is a multi-core processor, the calculations were not parallelized.
The routines were implemented in Matlab (Mathworks, Natick, MA, USA) and used the SVM
and level set methods from LIBLINEAR (a library for large linear classification, available at
http://www.csie.ntu.edu.tw/~cjlin/liblinear/) and ITK (Insight Segmentation
and Registration Toolkit, available at http://www.itk.org/), respectively. Time spent on load-
ing the data sets and some other minor overhead was excluded. The processing times are listed
in Table 2. For the three inner retinal interfaces, the reported times do not include the processing
time for the outer interfaces.
3.3. Thickness maps
Thickness maps were produced for the full retina to visually validate their correspondence to the
known retinal morphology. Thickness maps of only the RNFL, useful for glaucoma diagnosis,
were produced as well. These thickness maps are shown in Fig. 7 for one healthy and two
glaucomatous eyes. For reference, the en-face OCT image is shown in the first column of the
figure. Finally, thickness maps produced by a GDx are shown as well.
The retinal and RNFL thickness of the healthy eye matches known morphology. The retina
is thick near the ONH and also somewhat thicker towards the macula. The RNFL is thicker
superiorly and inferiorly compared to nasally and temporally, which is also reflected in the
total retinal thickness. In the two glaucomatous cases, the RNFL is thinner than in the normal
eye. Local defects may be found in the temporo-superior and temporo-inferior regions for the
first glaucomatous case and temporo-inferiorly for the second glaucomatous case. Compared to
the GDx images, these defects are better visualized in the OCT-derived thickness maps.
3.4. Vessel visualization
For visualization of vessels, the OCT data was averaged over a small region. Instead of defining
this region interactively, as in the C-mode images by Ishikawa et al. [21], it was defined automatically
based on the segmentation results. By defining the region as a few pixels above the RPE/choroid
interface, the retinal vessels were visualized very well due to their strong shading (see Fig.
8), similar to Jiao et al. [22]. Likewise, averaging the OCT data over a small area below the
RPE/choroid interface showed a pattern that resembles the choroidal vasculature, including
shading of the retinal vessels (see Fig. 9), as was done earlier by Lee et al. [23].
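A sketch of such a slab projection is given below; the slab width and offsets are our assumptions, as the paper specifies only "a few pixels" above or below the interface.

    import numpy as np

    def slab_projection(volume, rpe_z, offset, width=3):
        """En-face image from a thin slab positioned relative to the
        RPE/choroid interface. volume is (z, x, y); rpe_z holds the interface
        depth per A-line. A negative offset selects a slab just above the RPE
        (retinal vessel shadows, Fig. 8); a positive one, just below it
        (choroidal vasculature, Fig. 9)."""
        nz, nx, ny = volume.shape
        proj = np.zeros((nx, ny))
        for ix in range(nx):
            for iy in range(ny):
                z0 = int(np.clip(int(rpe_z[ix, iy]) + offset, 0, nz - width))
                proj[ix, iy] = volume[z0:z0 + width, ix, iy].mean()
        return proj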
4. Discussion
We have presented a method to automatically label retinal layers in OCT images. Instead of
relying on heuristics to define the boundaries of the layers, a pixel classification was used.
The classifiers were trained on manually segmented data and were based on simple Haar-like
features. The classifier’s performance clearly depends on the type of features that are provided.
However, the classifier decides (based on the training data) which features are selected and how
they are combined. For that process, existing techniques for feature selection and classification
from the pattern recognition and machine learning literature may be applied.
Table 2. Processing Times (Standard Deviation) of the Classification (Including Calculating
the Features) and Regularization Steps

                     Classification time (s)       Regularization time (s)
Interface            Normal        Glaucoma        Normal        Glaucoma
Vitreous - RNFL      82.0 (1.9)    82.7 (2.0)      91 (11)       112 (11)
RNFL - GCL           86.6 (1.4)    84.7 (0.7)      95 (17)       104 (23)
IPL - INL            74.1 (1.4)    73.7 (1.4)      122 (27)      143 (33)
OPL - ONL            74.0 (1.3)    74.3 (1.7)      111 (24)      103 (27)
RPE - Choroid        82.8 (1.4)    82.4 (1.7)      104 (16)      101 (15)

Fig. 7. Thickness maps produced after segmentation. The top row shows the results for a
normal eye (N); the second and third rows show the results for glaucomatous eyes (G). The first
column shows the en-face reconstruction, the second column shows the full retinal thickness,
the third column shows the RNFL thickness and the last column shows the thickness
as assessed by a GDx (dark, blue colors correspond to a thin RNFL and warm, red colors
correspond to a thick RNFL). Edges of local defects are indicated by red arrows.

The presented method is very flexible in that it can easily be adapted to other interfaces. For
this, the set of manually segmented data has to be extended to include the additional layers
and classifiers have to be trained on this data. Then, the new classifiers can simply be applied
to the data to obtain an automated segmentation of those layers. Likewise, the method may
be extended to other OCT devices. Once a set of manually labeled images acquired on such a
device is available, the classifiers may be trained on this data and subsequently applied to new
data sets originating from these machines.
The type of features and the type of classifier that was used was not optimized for each
interface separately. This paper illustrates how a single approach can be suitable for detecting
all interfaces, despite their different features. However, optimizing the choice of features and
classifiers per interface is likely to further improve the results. Additionally, the smoothness
may be optimized for each interface by adapting the weights α and β of the level set method in
Eq. (13). All of these refinements may also be used to incorporate additional prior knowledge.
For the intra-retinal interfaces, the errors in glaucomatous eyes were actually smaller than
those in normal eyes. Because the error on normal eyes was calculated by a leave-one-eye-out
method, the available training set when evaluating normal eyes contained less data (i.e., 18 B-
scans of 9 eyes) than the training set for evaluation of glaucomatous eyes (i.e., 20 B-scans of
10 eyes). This suggests that the estimated error for normal eyes may have been biased and that
the actual generalization error of the full training set is lower than the leave-one-out estimate
presented here. In addition, these results imply that increasing the pool of training data, i.e.,
manually segmenting B-scans from more normal eyes, will further improve the results for those
interfaces. This error behavior was not observed for the top and bottom interfaces of the retina,
which is consistent with the observation that those interfaces are easier to detect and therefore
less training data was required for accurate segmentation.
The root-mean-square error was, as expected, always larger than the mean absolute devi-
ation, because it includes both the variance and the bias of the error. However, the ratio of both
errors was not very constant. It varied by interface, which means that in some interfaces, the
relative variance in the error (i.e., the variance over the squared mean) was larger than in other
interfaces. This might have been caused by the automatic algorithm sometimes ‘locking’ to
the wrong boundary for those interfaces. Again, this was observed mostly for the intra-retinal
layers and adding more training data may improve this.
The running time of the algorithm was dominated by two parts: Pixel classification and inter-
face regularization. Pixel classification is a prime candidate for speed up by parallel computing.
Every A-line is processed independently and even a naïve parallelization will therefore result in
a very large speed-up. In addition, after the features are calculated from the A-lines, all pixels
on that A-line may be processed in parallel as well. Current graphics processing units (GPUs)
seem very suitable for this task. Parallelizing the level set method seems less straight-forward.
However, the level set method itself is not an integral part of the presented algorithm, but only
used to post-process the classification results. Therefore, any method that results in a regular-
ized interface suffices.
In this paper we introduced a segmentation method that learns from example data and may
therefore be used to segment various layers in OCT data. The method is not limited to a single
OCT device or scan protocol but can adapt to different input data by using new training data.
The algorithm first processes single A-lines which are then combined in a three-dimensional
regularization procedure, reducing scanning artifacts and resulting in interfaces that appear to
match known morphology.

Fig. 8. Integrated OCT data just above the RPE, showing (shadows of) retinal vessels, for
a normal (left) and a glaucomatous (right) eye.

Fig. 9. Integrated OCT data just below the RPE, showing choroidal vasculature (and remnants
of retinal vessels), for a normal (left) and a glaucomatous (right) eye.
To illustrate the use of the segmentation results, two applications were shown in this paper:
thickness maps and vessel projections. RNFL thickness maps, such as the ones presented in Fig.
7, are a clinically very useful tool for assessment and monitoring of glaucoma. In a similar
way, other thickness maps, representing the thickness of other (combinations of) layers, may
be produced as well. In addition, the segmentation results may be used to selectively integrate
the OCT signal over specific layers, thus producing an image that shows the scattering prop-
erties of those single layers. This may provide additional information on (the progression of)
the state of a disease. Finally, the segmentation results may also be used for visualization of the
raw OCT data, e.g. by coloring each layer differently. In all of these cases, the described seg-
mentation method is an essential step in transforming a raw OCT scan into a clinically useful
representation of the data.
Acknowledgments
This work was sponsored in part by The Rotterdam Eye Hospital Research Foundation, Rotter-
dam, The Netherlands; Stichting Oogfonds Nederland, Utrecht, The Netherlands; Glaucoom-
fonds, Utrecht, The Netherlands; Landelijke Stichting voor Blinden en Slechtzienden, Utrecht,
The Netherlands; Stichting voor Ooglijders, Rotterdam, The Netherlands; Stichting Nederlands
Oogheelkundig Onderzoek, Nijmegen, The Netherlands.