Two-step multi-spectral registration via key-point detector and gradient
similarity. Application to agronomic scenes for proxy-sensing
Jehan-Antoine VAYSSADE 1a, Gawain Jones 1b, Jean-Noel Paoli 1c and Christelle Gee 1d
1Agroécologie, AgroSup Dijon, INRA, Univ. Bourgogne-Franche-Comté, F-21000 Dijon, France
jehan-antoine.vayssade@inra.fr, {gawain.jones, jean-noel.paoli, christelle.gee}@agrosupdijon.fr
Keywords:
Registration, Multi-spectral imagery, Precision farming, Feature descriptor
Abstract:
The potential of multi-spectral imagery is growing rapidly in precision agriculture, and currently relies on multi-sensor cameras. However, their development usually targets aerial applications, and their parameters are optimized for high-altitude acquisition by drone (UAV, ≈ 50 meters) to ensure surface coverage and reduce technical problems. With the recent emergence of terrestrial robots (UGV), these cameras are being repurposed for close-range agronomic applications, making it possible to explore new applications and maximize the extraction of specific traits (spectral indices, shape, texture, ...), which requires high spatial resolution. The problem with these cameras is that their sensors are not aligned and the manufacturers' methods are not suitable for close-field acquisition, resulting in offsets between spectral images and degrading the quality of the extractable information. We therefore need a solution to accurately align the images in such conditions.
In this study we propose a two-step method applied to the six-band Airphen multi-sensor camera, with (i) an affine correction using matrices pre-calibrated at different heights, where the closest transformation can be selected via the internal GPS, and (ii) a perspective correction that refines the previous one using key-point matching between the enhanced gradients of each spectral band. Nine key-point detection algorithms (ORB, GFTT, AGAST, FAST, AKAZE, KAZE, BRISK, SURF, MSER), each with three parameter modalities, were evaluated for speed and performance, and the best reference spectrum was defined for each of them. The results show that GFTT is the most suitable method for key-point extraction on our enhanced gradients, and its best spectral reference was identified to be the band centered on 570 nm. Without any treatment the initial error is about 62 px; with our method the remaining residual error is less than 1 px, whereas the manufacturer's method involves distortions and loss of information with an estimated residual error of approximately 12 px.
1 INTRODUCTION
Modern agriculture is changing towards a system that is less dependent on pesticides [Lechenet et al., 2014] (herbicides remain the most difficult pesticides to reduce), and digital tools are of great help in this matter. The development of imaging and image processing has made it possible to characterize an agricultural plot [Sankaran et al., 2015] (crop health status or soil
a https://orcid.org/0000-0002-7418-8347
b https://orcid.org/0000-0002-5492-9590
c https://orcid.org/0000-0002-0499-9398
d https://orcid.org/0000-0001-9744-5433
characteristics) using non-destructive agronomic indices [Jin et al., 2013], replacing traditional destructive and time-consuming methods. In recent years, the arrival of miniaturized multi-spectral and hyper-spectral cameras on Unmanned Aerial Vehicles (UAVs) has allowed spatio-temporal field monitoring. These vision systems have been developed for precise working conditions (flight height ≈ 50 m). Although very practical to use, they are also employed for proxy-sensing applications. However, the algorithms offered by manufacturers to co-register multiple single-band images at different spectral ranges are not optimal at low heights. A specific close-field image registration is thus required.
Image registration is the process of transforming different images of one scene into the same coordinate system. The spatial relationships between these images can be rigid (translations and rotations), affine (shears, for example), homographic, or complex large-deformation models (due to the difference of depth between ground and leaves) [Kamoun, 2019]. The main difficulty is that multi-spectral cameras have low spectral coverage between bands, resulting in a loss of common characteristics between them: (i) plant leaves have a different aspect depending on the spectral band, and (ii) our images contain highly complex and self-similar structures [Douarre et al., 2019]. This affects the process of detecting common characteristics between bands for image registration. There are two types of registration, feature-based and intensity-based [Zitová and Flusser, 2003]. Feature-based methods work by extracting points of interest and using feature matching; in most cases a brute-force matching is used, making these techniques slow. Fortunately, the features can be filtered on their spatial properties to reduce the matching cost, and a GPGPU implementation can also reduce the comparison cost. Intensity-based automatic image registration is an iterative process whose metrics are sensitive to the number of iterations, making such methods computationally expensive for precise registration. Furthermore, multi-spectral registration implies different metrics for each registered band, which is hard to achieve.
Dierent studies of images alignment using
multi-sensors camera can be found for acquisition
using UAV at medium (50 200 m) and high
(200 1000 m) distance. Some show good
performances (in term of number of key-points)
of feature based [Dantas Dias Junior et al., 2019,
Vakalopoulou and Karantzalos, 2014] with
strong enhancement of feature descriptor for
matching performances. Other prefer to use in-
tensity based registration [Douarre et al., 2019]
on better convergence metrics [Chen et al., 2018]
(in term of correlation), which is slower and
not necessarily robust against light variability
and their optimization can also fall into a local
minimum, resulting in a non-optimal registration
[Vioix, 2004].
The traditional approach to multi-spectral image registration is to designate one channel as the target channel and register all the others on the selected one. Currently, only [Dantas Dias Junior et al., 2019] show a method for selecting the best reference, and no study has defined the best spectral reference for agronomic scenes. In all cases the NIR (850 nm) or a middle-range spectral reference is conventionally used, without studying the others for precision agriculture. In addition, those studies mainly propose feature matching without a large comparison of methods [Dantas Dias Junior et al., 2019] (fewer than 4) on their performance (time/precision), without showing the importance of the spectral reference or the interest of a normalized-gradient transformation (as used in intensity-based methods).
However, despite the growing use of UGVs and multi-spectral imaging, the domain is not well covered, and no study has been found under agricultural, outdoor conditions with a near field of view (less than 10 meters) for multi-spectral registration.
Thus, this study proposes a benchmark of popular feature extractors applied to a normalized-gradient transformation, and the best spectral reference is defined for each of them. Moreover, a pre-alignment by affine registration is used to filter the feature matching, evaluated at different spatial resolutions. This study therefore shows the importance of the choice of the reference and of the feature extractor on normalized gradients for such a registration.
2 MATERIAL AND METHOD
2.1 Material
2.1.1 Camera
The multi-spectral imagery is provided by the six-band multi-spectral camera Airphen developed by HiPhen. Airphen is a scientific multi-spectral camera developed by agronomists for agricultural applications. It can be embedded in different types of platforms such as UAVs, phenotyping robots, etc.
Airphen is highly configurable (bands, fields of view), lightweight and compact. The camera was configured with interferential filters centered at 450/570/675/710/730/850 nm, each with a FWHM (Full Width at Half Maximum) of 10 nm; the position of each band is referenced in figure 1. The focal length is 8 mm for all wavelengths. The raw resolution of each spectral band is 1280 × 960 px with 12 bits of precision. Finally, the camera also provides an internal GPS antenna that can be used to get the distance from the ground.
Figure 1: Disposition of each band on the Airphen multi-sensor camera
2.1.2 Datasets
Two datasets were acquired at different heights, with images of a chessboard (used for calibration) and of an agronomic scene. We used a metallic gantry to position the camera at the different heights. The size of the gantry is 3 × 5 × 4 m. Due to the size of the chessboard (57 × 57 cm, with 14 × 14 squares of 4 cm), the limited focus of the camera and the gantry height, we bounded the acquisition heights from 1.6 to 5 m with 20 cm steps, which represents 18 acquisitions.
The first dataset is for the calibration: a chessboard is imaged at the different heights. The second one is for the alignment verification under real conditions: one shot of an agronomic scene is taken at each height, with a maximum bias set at 10 cm.
2.2 Methods
Alignment is refined in two stages, with (i) an affine registration approximately estimated and (ii) a perspective registration for refinement and precision. As an example, figure 2 shows each correction step, where the first line corresponds to the (i) affine correction (section 2.2.1) and the second to the (ii) perspective correction. More precisely, in the second step each channel is pre-processed and feature detectors are used to detect key-points (section 2.2.2). The key-points of each channel are matched to those of the chosen spectral band to compute the perspective correction through a homography (section 2.2.2). These steps are explained in the following subsections.
Figure 2: Each step of the alignment procedure, with (step 1) rough correction from the affine calibration and (step 2) refinement via key-points and perspective correction
2.2.1 Affine Correction
We make the assumption that the closer the snapshot is taken, the bigger the distance between the spectral bands is. On the other hand, at a distance greater than or equal to 5 m, the initial affine correction becomes stable. A calibration is used to build a linear model based on that assumption, which allows the affine correction to work at any height. The main purpose of this step is to reduce the offset between the spectral bands, which allows the similarity between key-points to be spatially delimited within a few pixels, making feature matching more effective.
Calibration: Based on the previous assumption, a calibration is run over the chessboard dataset. We detect the chessboard with the OpenCV calibration toolbox [Bouguet, 2001] on each spectral image (normalized by I = (I − min(I)) / max(I), where I is the spectral image) at the different heights (from 1.6 m to 5 m). We use the function findChessboardCorners, which attempts to determine whether the input image is a view of the chessboard pattern and locates the internal chessboard corners. The detected coordinates are roughly approximated; to determine their positions accurately we use the function cornerSubPix, as explained in the documentation 2.
2 https://docs.opencv.org/2.4/modules/calib3d/doc/camera_calibration_and_3d_reconstruction.html
Linear model: Using all the points detected for each spectral band, we compute the centroid grid (the average of each point across bands). The affine transform from each spectral band to this centroid grid is estimated. Theoretically, the rotation and scale coefficients (A, B, C, D) do not depend on the distance to the ground, but the translation (X, Y) does. Thus a Levenberg-Marquardt curve-fitting algorithm with linear least-squares regression [Moré, 1978] can be used to fit an equation for each spectral band, for X and Y independently, to the centroid grid. We fit the curve t = αh³ + βh² + θh + γ, where h is the height, t is the resulting translation, and the factors α, β, θ, γ are the model parameters.
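This per-band fit can be sketched with SciPy, whose `curve_fit` uses Levenberg-Marquardt for unbounded problems; the heights and translations below are made-up stand-ins for the calibration measurements, not values from the paper.

```python
import numpy as np
from scipy.optimize import curve_fit  # Levenberg-Marquardt for unbounded fits

def translation_model(h, alpha, beta, theta, gamma):
    """Cubic model t = alpha*h^3 + beta*h^2 + theta*h + gamma."""
    return alpha * h**3 + beta * h**2 + theta * h + gamma

# Hypothetical X-translations (px) of one band at each calibration height (m)
heights = np.arange(1.6, 5.01, 0.2)      # 1.6 m to 5 m in 20 cm steps
tx = 30.0 / heights - 4.0                # stand-in measurements
tx += 0.05 * np.sin(7.0 * heights)       # small synthetic measurement noise

params, _ = curve_fit(translation_model, heights, tx)

def predict_translation(h):
    """Translation (px) predicted for an arbitrary height h (m)."""
    return translation_model(h, *params)
```

The fitted `predict_translation` plays the role of the paper's per-band, per-axis curve: given the approximate height from the GPS, it returns the translation component of the affine matrix.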
Correction: Based on the model estimated on the chessboard dataset, we transpose it to the agronomic dataset. To build the affine correction matrix, we use the rotation and scale factors from the most accurate height (1.6 m, where the spatial resolution of the chessboard is highest), since they do not theoretically depend on the height. For the translation part, the curve model is evaluated for each spectral band at the height provided by the user. Each spectral band is then warped using the corresponding affine transformation. Finally, all spectral bands are cropped to the minimal bounding box (given by the minimal and maximal translations of the affine matrices). This first correction is an approximation; it provides spatial properties that we use in the second stage.
2.2.2 Perspective Correction
Each spectral band has different properties and values by nature, but we can extract the corresponding similarity by transforming each spectral band into its absolute derivative, in order to find similarities in gradient breaks among them. As can be seen in figure 3, gradients can have opposite directions depending on the spectral band, making the absolute derivative an important step for matching between different spectral bands.
Figure 3: Gradient orientation in spectral bands [Rabatel and Labbe, 2016]. The orientation of the gradient is not the same depending on the spectral band.
The affine correction helps the feature matching by adding near-epipolar-line properties. The matching of extracted features can thus be spatially bounded: (i) we know that the maximum translation is limited to a distance of a few pixels (less than 10 px thanks to the affine correction), and (ii) the angle between the initial key-point and the matched one is limited to [−1, 1] degree.
Computing the gradient: To compute the gradient of the image with a minimal impact of the light distribution (shadow, reflectance, specular, ...), each spectral band is first normalized using a Gaussian blur [Sage and Unser, 2003]; the kernel size is defined by next_odd(image_width^0.4) (19 in our case) and the normalized image is defined by I / (G + 1) × 255, where I is the spectral band and G is its Gaussian blur. This first step minimizes the impact of noise on the gradient and smooths the signal in case of high reflectance. From this normalized image, the gradient I_grad(x, y) is computed as the sum of the absolute Scharr filters [Seitz, 2010] for the horizontal (Sx) and vertical (Sy) derivatives, noted I_grad(x, y) = ½|Sx| + ½|Sy|. Finally, all gradients I_grad(x, y) are normalized using CLAHE [Zuiderveld, 1994] to locally improve their intensity and increase the number of key-points detected.
Key-point Extractor: A key-point is a point of interest; it defines what is important and distinctive in an image. Different types of key-point extractors are available, and the following are tested:
(ORB) Oriented FAST and Rotated BRIEF [Rublee et al., 2011], (AKAZE) fast explicit diffusion for accelerated features in nonlinear scale spaces [Alcantarilla and Solutions, 2011], (KAZE) a multi-scale 2D feature detection and description algorithm in nonlinear scale spaces [Ordonez et al., 2018], (BRISK) Binary Robust Invariant Scalable Keypoints [Leutenegger et al., 2011], (AGAST) adaptive and generic corner detection based on the accelerated segment test [Mair et al., 2010], (MSER) Maximally Stable Extremal Regions [Donoser and Bischof, 2006], (SURF) Speeded-Up Robust Features [Bay et al., 2006], (FAST) FAST algorithm for corner detection [Trajković and Hedley, 1998] and (GFTT) Good Features To Track [Shi et al., 1994].
These algorithms are largely described across multiple studies [Dantas Dias Junior et al., 2019, Tareen and Saleem, 2018, Zhang et al., 2016, Ali et al., 2016], and they are all available and easily usable in OpenCV. We have therefore studied them by varying the most influential parameters of each, with three modalities; table 1 in the appendix lists all the modalities.
Table 1: List of algorithms with 3 modalities of their parameters

ABRV   parameters               modality 1   modality 2   modality 3
ORB    nfeatures                5000         10000        15000
GFTT   maxCorners               5000         10000        15000
AGAST  threshold                71           92           163
FAST   threshold                71           92           163
AKAZE  nOctaves, nOctaveLayers  (1, 1)       (2, 1)       (2, 2)
KAZE   nOctaves, nOctaveLayers  (4, 2)       (4, 4)       (2, 4)
BRISK  nOctaves, patternScale   (0, 0.1)     (1, 0.1)     (2, 0.1)
MSER   None                     None         None         None
SURF   nOctaves, nOctaveLayers  (1, 1)       (2, 1)       (2, 2)
Key-point detection: One of the key-point extractors mentioned above is applied to the gradients of each spectral band (all extractors are evaluated). For each detected key-point, we extract a descriptor using ORB features. We match all detected key-points to a reference spectral band (all bands are evaluated). All matches are filtered by distance, position and angle, to eliminate a majority of false positives along the epipolar line. Finally, we apply the function findHomography to the detected and filtered key-points with RANSAC [Fischler and Bolles, 1981], to determine the best subset of matches and compute the perspective correction.
Correction: The perspective correction from each spectral band to the reference is estimated and applied. Finally, all spectral bands are cropped to the minimal bounding box, which is obtained by applying the perspective transformation to each corner of the image.
3 RESULTS AND DISCUSSION
Firstly the results focus on the affine correction, and then on the effects of the perspective correction. Figure 4 shows a close-up at 1.6 m of (4a) the raw image acquisition, (4c & 4d) the registered images at each correction step, and (4b) the manufacturer's result.
Figure 4: Example of each correction and the manufacturer's result: (a) raw image, (b) manufacturer's, (c) roughly corrected, (d) fully corrected
3.1 Affine Correction
The affine correction model is based on the calibration dataset (where the chessboards were acquired). The 6 coefficients (A, B, C, D, X, Y) of the affine matrix were studied according to the height of the camera in order to assess their stability. It appears that the translation part (X, Y) depends on the distance to the field (appendix figure 5), in line with the initial assumption. For this part, the linear model is used to estimate the affine correction from an approximate height.
Figure 5: Affine matrix translation values (X, Y) by height
Rotation and scale do not depend on the ground distance (figure 6), in line with the theory. These factors (A, B, C, D) are quite stable and close to identity, as expected (accuracy depends on the spatial resolution of the board). As a result, a single calibration can be used for this part of the matrix, and the most accurate one is used (i.e., where the chessboard has the best spatial resolution).
Figure 6: Affine matrix rotation and scale values (A, B, C, D) by height
After the affine correction, the remaining residual distances were extracted; they are computed from the detected, filtered and matched key-points relative to the reference spectral band. Figure 9 (top) shows an example using 570 nm as the reference, before the perspective correction. The remaining distance from each spectral band to the reference varies according to the gap between the real height and the nearest calibrated one (selected through the linear model). Remember that a bias of +/− 10 cm was initially set to show the error in the worst case, so the differences in error between bands are due to the differences of sensor position in the array relative to the reference and to the approximate height provided.
3.2 Perspective correction
The gures 7shows the numbers of key-points
after ltering and homographic association (min-
imum of all matches) as well as the computa-
tion time and performance ratio (matches/time)
for each method. The performance ratio is used
to compare methods between them, bigger he is,
greater is the method (balanced between time and
accuracy), making lower of them unsuitable.
Figure 7: features extractor performances after lter-
ing and homography association
All these methods offer interesting results; the choice of method depends on the application's needs in terms of computation time and accuracy. Three methods stand out across all of their modalities:
• GFTT shows the best overall performance in both computation time and number of matches.
• FAST and AGAST are quite suitable too, with acceptable computation times and good matching performance.
The other methods did not show any improvement in time or matches (especially compared to GFTT), and some of them produce a number of matches that may be too small to ensure precision. Increasing the number of matched key-points allows a slightly higher accuracy [Dantas Dias Junior et al., 2019]. For example, switching from SURF (≈30 matches) to FAST (≈130 matches) reduces the final residual distance from ≈1.2 to ≈0.9 px but increases the computation time from ≈5 to ≈8 seconds.
All methods show that the best spectral band is 710 nm (in red), with the exception of SURF and GFTT, for which it is 570 nm. Figure 8 shows the minimum number of matches between each reference spectrum and all the others, for each relevant method and modality (KAZE, AGAST, FAST, GFTT). Choosing the right spectral reference is important: as we can see, no correspondence is found in some cases between 675 and 850 nm, while correspondences are found between 675-710 nm and 710-850 nm, making 710 nm more appropriate; the same behavior can be observed for the other bands, with 570 nm as the most appropriate one. This is visible in the figure for all methods: 570 nm and 710 nm have the best minimum number of matches, whereas all the others are quite small.
Figure 8: Key-point extractor performances by spectral reference
The residuals of the perspective correction show that each spectral band is correctly registered; figure 9 (bottom) shows the residual distance at different ground distances. In comparison, the affine correction errors lie between [1.0-4.8] px, whereas with the combination of affine and perspective corrections the residual errors lie between [0.7-1.0] px. On average, the perspective correction improves the residual error by (3.5 − 0.9)/3.5 ≈ 74%.
Figure 9: (top) Mean distance of the detected key-points before perspective correction, with 570 nm as spectral reference; (bottom) perspective re-projection error with GFTT using the first modality and 570 nm as reference
3.3 General discussion
Even if the relief of the scene is not taken into account by the deformation model used, in our case, with flat ground, no difference arises. More complex deformation models [Lombaert et al., 2012, Bookstein, 1989] could be used to reduce the remaining error, but they could also, in some cases, create large angular deformations caused by the proximity of key-points; it is of course possible to filter such key-points, which would in turn reduce the overall accuracy.
Further research can be performed on each parameter of the feature extractors; for those who need a specific performance trade-off (time/precision), we invite anyone to download the dataset and test various combinations. The feature matching can also be optimized: at this stage we use brute-force matching with post-filtering, but an implementation that exploits our spatial properties directly should greatly enhance the number of matches by reducing false positives.
4 CONCLUSION
In this work, the application of different techniques for multi-spectral image registration was explored using the Airphen camera. We tested nine types of key-point extractors (ORB, GFTT, AGAST, FAST, AKAZE, KAZE, BRISK, SURF, MSER) at different heights and compared the number of control points obtained. As seen in the results, the most suitable method is GFTT (regardless of modality 1, 2 or 3), with a significant number of matches (150-450) and a reasonable computation time (1.17 s to 3.55 s depending on the modality).
Furthermore, the best spectral reference was defined for each method, for example 570 nm for GFTT. We observed a residual error of less than 1 px, presumably caused by the difference in sensor nature (spectral range, lens).
ACKNOWLEDGMENTS
The authors acknowledge support from the European Union through the H2020 project IWMPRAISE 3 (Integrated Weed Management: PRActical Implementation and Solutions for Europe) and from the ANR Challenge ROSE through the project ROSEAU 4 (RObotics SEnsorimotor loops to weed AUtonomously).
We would like to thank Combaluzier Quentin, Michon Nicolas, Savi Romain and Masson Jean-Benoit for the realization of the metallic gantry that helped us position the camera at different heights.
REFERENCES
[Alcantarilla and Solutions, 2011] Alcantarilla, P. F. and Solutions, T. (2011). Fast explicit diffusion for accelerated features in nonlinear scale spaces. IEEE Trans. Patt. Anal. Mach. Intell., 34(7):1281–1298.
[Ali et al., 2016] Ali, F., Khan, S. U., Mahmudi, M. Z., and Ullah, R. (2016). A comparison of FAST, SURF, Eigen, Harris, and MSER features. International Journal of Computer Engineering and Information Technology, 8(6):100.
3https://iwmpraise.eu/
4http://challenge-rose.fr/en/projet/roseau-2/
[Bannari et al., 1995] Bannari, A., Morin, D., Bonn,
F., and Huete, A. R. (1995). A review of vegetation
indices. Remote Sensing Reviews, 13(1-2):95–120.
[Bay et al., 2006] Bay, H., Tuytelaars, T., and
Van Gool, L. (2006). Surf: Speeded up robust fea-
tures. In European conference on computer vision,
pages 404–417. Springer.
[Bookstein, 1989] Bookstein, F. L. (1989). Principal
warps: thin-plate splines and the decomposition
of deformations. IEEE Transactions on Pattern
Analysis and Machine Intelligence, 11(6):567–585.
[Bouguet, 2001] Bouguet, J.-Y. (2001). Camera cal-
ibration toolbox for matlab.
[Chen et al., 2018] Chen, S., Shen, H., Li, C., and
Xin, J. H. (2018). Normalized total gradient:
A new measure for multispectral image registra-
tion. IEEE Transactions on Image Processing,
27(3):1297–1310.
[Dantas Dias Junior et al., 2019] Dantas Dias Ju-
nior, J., Backes, A., and Escarpinati, M. (2019).
Detection of control points for uav-multispectral
sensed data registration through the combining of
feature descriptors. pages 444–451.
[Donoser and Bischof, 2006] Donoser, M. and Bischof, H. (2006). Efficient maximally stable extremal region (MSER) tracking. In 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06), volume 1, pages 553–560. IEEE.
[Douarre et al., 2019] Douarre, C., Crispim-Junior,
C. F., Gelibert, A., Tougne, L., and Rousseau, D.
(2019). A strategy for multimodal canopy images
registration. In 7th International Workshop on Im-
age Analysis Methods in the Plant Sciences, Lyon,
France.
[Filella et al., 1995] Filella, I., Serrano, L., Serra, J.,
and Penuelas, J. (1995). Evaluating wheat nitrogen
status with canopy reectance indices and discrim-
inant analysis. Crop Science, 35(5):1400–1405.
[Fischler and Bolles, 1981] Fischler, M. A. and Bolles, R. C. (1981). Random sample consensus: A paradigm for model fitting with applications to image analysis and automated cartography. Commun. ACM, 24(6):381–395.
[Jin et al., 2013] Jin, X.-l., Diao, W.-y., Xiao, C.-h.,
Wang, F.-y., Chen, B., Wang, K.-r., and Li, S.-k.
(2013). Estimation of wheat agronomic parameters
using new spectral indices. PLOS ONE, 8(8):1–9.
[Kamoun, 2019] Kamoun, E. (2019). Image registra-
tion: From sift to deep learning.
[Lechenet et al., 2014] Lechenet, M., Bretagnolle, V., Bockstaller, C., Boissinot, F., Petit, M.-S., Petit, S., and Munier-Jolain, N. M. (2014). Reconciling pesticide reduction with economic and environmental sustainability in arable farming. PLOS ONE, 9(6):1–10.
[Leutenegger et al., 2011] Leutenegger, S., Chli, M., and Siegwart, R. (2011). BRISK: Binary robust invariant scalable keypoints. In 2011 IEEE International Conference on Computer Vision (ICCV), pages 2548–2555. IEEE.
[Lombaert et al., 2012] Lombaert, H., Grady, L.,
Pennec, X., Ayache, N., and Cheriet, F. (2012).
Spectral demons – image registration via global
spectral correspondence. In Fitzgibbon, A., Lazeb-
nik, S., Perona, P., Sato, Y., and Schmid, C., edi-
tors, Computer Vision – ECCV 2012, pages 30–44,
Berlin, Heidelberg. Springer Berlin Heidelberg.
[Mair et al., 2010] Mair, E., Hager, G. D., Burschka,
D., Suppa, M., and Hirzinger, G. (2010). Adaptive
and generic corner detection based on the accel-
erated segment test. In European conference on
Computer vision, pages 183–196. Springer.
[Moré, 1978] Moré, J. J. (1978). The levenberg-
marquardt algorithm: Implementation and theory.
In Watson, G., editor, Numerical Analysis, volume
630 of Lecture Notes in Mathematics, pages 105–
116. Springer Berlin Heidelberg.
[Ordonez et al., 2018] Ordonez, A., Arguello, F., and
Heras, D. B. (2018). Alignment of hyperspectral
images using kaze features. Remote Sensing, 10(5).
[Rabatel and Labbe, 2016] Rabatel, G. and Labbe, S. (2016). Registration of visible and near infrared unmanned aerial vehicle images based on Fourier-Mellin transform. Precision Agriculture, 17(5):564–587.
[Rublee et al., 2011] Rublee, E., Rabaud, V., Konolige, K., and Bradski, G. (2011). ORB: An efficient alternative to SIFT or SURF. In Proceedings of the 2011 International Conference on Computer Vision, ICCV '11, pages 2564–2571, Washington, DC, USA. IEEE Computer Society.
[Sage and Unser, 2003] Sage, D. and Unser, M.
(2003). Teaching image-processing programming in
java. IEEE Signal Processing Magazine, 20(6):43–
52. Using “Student-Friendly” ImageJ as a Peda-
gogical Tool.
[Sankaran et al., 2015] Sankaran, S., Khot, L. R., Espinoza, C. Z., Jarolmasjed, S., Sathuvalli, V. R., Vandemark, G. J., Miklas, P. N., Carter, A. H., Pumphrey, M. O., Knowles, N. R., and Pavek, M. J. (2015). Low-altitude, high-resolution aerial imaging systems for row and field crop phenotyping: A review. European Journal of Agronomy, 70:112–123.
[Seitz, 2010] Seitz, H. (2010). Contributions to the minimum linear arrangement problem.
[Shi et al., 1994] Shi, J. and Tomasi, C. (1994). Good features to track. In 1994 Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, pages 593–600. IEEE.
[Tareen and Saleem, 2018] Tareen, S. A. K. and Saleem, Z. (2018). A comparative analysis of SIFT, SURF, KAZE, AKAZE, ORB, and BRISK. 2018 International Conference on Computing, Mathematics and Engineering Technologies (iCoMET), pages 1–10.
[Trajković and Hedley, 1998] Trajković, M. and Hedley, M. (1998). Fast corner detection. Image and Vision Computing, 16(2):75–87.
[Vakalopoulou and Karantzalos, 2014] Vakalopoulou, M. and Karantzalos, K. (2014). Automatic descriptor-based co-registration of frame hyperspectral data. Remote Sensing, 6.
[Vioix, 2004] Vioix, J.-B. (2004). Conception et réalisation d'un dispositif d'imagerie multispectrale embarqué : du capteur aux traitements pour la détection d'adventices [Design and realization of an embedded multispectral imaging device: from the sensor to processing for weed detection].
[Zhang et al., 2016] Zhang, H., Wohlfeil, J., and Grießbach, D. (2016). Extension and evaluation of the AGAST feature detector.
[Zitová and Flusser, 2003] Zitová, B. and Flusser, J. (2003). Image registration methods: A survey. Image and Vision Computing, 21:977–1000.
[Zuiderveld, 1994] Zuiderveld, K. (1994). Contrast limited adaptive histogram equalization. In Graphics Gems IV, pages 474–485. Academic Press Professional, Inc.