All content in this area was uploaded by Loïca Avanthey on May 12, 2021
UNDERWATER FIELD EQUIPMENT OF A NETWORK OF LANDMARKS OPTIMIZED FOR
AUTOMATIC DETECTION BY AI
L. Beaudoin, L. Avanthey
SEAL Research Team (Sense, Explore, Analyse and Learn),
EPITA, 14-16 rue Voltaire, 94270 Le Kremlin-Bicêtre, France,
Équipe Acquisition et Traitement, IGN, 73 avenue de Paris, 94165 Saint-Mandé, France
ABSTRACT
To qualify the point clouds obtained by 3D reconstruction of a whole
study area in close-range remote sensing, control points are used:
their positions are measured in the field, essentially by hand, with
an instrument of known precision. Underwater, equipping the field and
carrying out these measurements is a complex operation because of the
peculiarities of the environment. In this article we present a first
step towards automating this task: the automatic detection of targets
by a deep learning algorithm, which will serve to position the control
points correctly at the local scale, and a simplified manual
measurement, which will serve in future work to check the results of
the automatic readings.
Index Terms—GCP network, underwater artificial land-
marks, detection and identification by deep-learning, quick
manual in situ measurements
1. INTRODUCTION
In photogrammetry, the result of a 3D reconstruction is qualified by
comparing the model obtained with in situ measurements on ground
control points (GCP). This approach is valid in both aerial and
underwater environments. The first experiments took place in the
aerial world, but these techniques cannot be transferred directly to
the underwater world because of the specificities of this environment
(lower visibility, diffusion and diffraction phenomena, scene
dynamism, etc.).
The measurements of the GCP network are made manually. The operational
constraints, however, are much stronger underwater because of the
divers' limited access time to the site and the difficulty of
obtaining an absolute (no GPS signal) or relative (visibility of a few
meters) positioning. There is therefore great interest in a fully
automatic methodology for referencing GCP. Its main advantage is to
reduce the cost of manual measurements while increasing the accuracy
of the overall network by increasing the number of GCP.
In this article, we present our experimental results on the first
steps of this methodology. First, we present our work on artificial
landmarks that serve as GCP and a deep learning algorithm dedicated to
the automatic detection and identification of these GCP in an
underwater environment. Then we present an original and operational
methodology for quick in situ manual measurements of the GCP network,
which will be useful to qualify the future results of the fully
automatic method. The last section presents the results obtained
during several underwater acquisition campaigns.
2. CONTEXT AND STATE OF THE ART
In close-range remote sensing, because of the short observation
distance, the global reconstruction of a scene is built from a
collection of local sub-scenes. This is even more marked in the
underwater environment, where visibility is reduced to a few meters.
In situ measurements on underwater GCP concern physical parameters
such as their depth, their distribution and the distances between
them. These GCP are used to qualify the precision of the model.
There are therefore two precision scales: local precision (within the
same sub-scene) and global precision (over the complete scene), which
notably includes slow drifts that are difficult to detect at the local
scale. One can thus have, for example, precise local reconstructions
but an imprecise global reconstruction. In the literature, the
techniques used to improve precision differ depending on which of
these two scales is of interest.
For local precision, the classic method consists of using standard
rulers (alternating colored bands representing known distances). For
frequently visited sites that are studied over several days, such as
archaeological sites, the scene is partially or totally equipped with
fixed or mobile grids [1].
For global precision, the most widespread method, inherited from
photogrammetric practice, consists of distributing specific markers
over the study area (targets that are weighted, screwed or even driven
into the ground). These targets can also be uniquely colored and
numbered, like archaeological labels. Measurements are then taken: the
GCP network thus obtained makes it possible to check the coherence of
the reconstructed global scene.
Fig. 1. State of the art GCP and measurement method [6, 7].
In the underwater environment (see figure 1), where no absolute
positioning is directly available, it is only possible to measure
relative distances between GCP [2, 3, 4, 5]. Their depth can be
measured absolutely with a sensor or relatively with graduated
vertical bars and level lines. When the surface is close, all these
measurements can be referenced in an absolute coordinate system, for
example by using vertical bars equipped with a GPS sensor or by
external trilateration from a fixed point.
The common problem with all of these methods is their human and
logistical cost. They rely heavily on divers, whose access time to the
site is limited by meteorological, physical (the deeper the site, the
shorter the in situ time), administrative (number of dives per day) or
logistical constraints.
To limit in situ manipulations, the characteristic points used as GCP
could be natural, but the seabed is extraordinarily diverse: it can be
uniform (sand), contain too much information (reef), contain
indistinguishable information (grass or pebbles), contain information
of variable quality (drop-off), etc. It is therefore very difficult to
make a priori assumptions about the quality and distribution of the
landmarks of an area. However, it is important to control these
parameters because they affect the quality and precision of the
estimate. For example, one rule is to favor angles of around 60
degrees between landmarks to optimize trilateration.
In addition, even if well-distributed natural landmarks were
available, it would be very complicated to describe them precisely
enough, while remaining concise, to reference them and find them again
in the images. The problem becomes even more complex at greater depth,
because the orientation of the lighting completely changes the
perception of the scene [8]. To overcome all these difficulties, we
chose to use artificial landmarks rather than rely on natural ones.
3. AUTOMATIC DETECTION OF
CHARACTERISTIC POINTS
We created our own landmarks (see figure 2), building on aerial work
on automatic tracking [9], feedback from the underwater work of [10]
and work on target detection by computer vision suited to the
underwater world [11].
Fig. 2. Artificial landmarks optimized for the establishment of an
underwater network of GCP detectable by computer vision.
To reduce the time required to equip the site and to improve the
relevance of this equipment, our landmarks allow us to estimate local
and global accuracy at the same time. Each landmark is a square
subdivided into 9 tiles of known size that can be used as a 2D
yardstick. Its corners, its center and those of its tiles can be
located robustly both in situ and in the images.
As we want our method to become fully automatic over time, we adapted
the design of our landmarks to detection and identification
algorithms. The periodic repetition of the same pattern on standard
rulers creates ambiguity for an automatic process, and archaeological
tags are too small to ensure proper detection. Moreover, these
state-of-the-art landmarks are not robust to partial masking (algae,
sand, fish, etc.), which is very common underwater.
Our landmarks measure 20 × 20 cm so that they are easily detected at
an observation distance of 2 to 3 m. Attenuated colors on light tiles
are the best compromise we have found to maintain good contrast
between neighboring tiles, which facilitates detection, without
saturating the sensors whatever the lighting conditions. The pattern
of the top, bottom, right and left central tiles is unique and uses
two complementary identification strategies (squares and arcs),
whereas the central number is mainly used for manual referencing. The
orientation of the landmark can be deduced automatically from the
asymmetry of the colored tiles and the orientation of their letters.
For automatic detection, we rely on the deep learning algorithm YOLO
(You Only Look Once) [12]. We use its version 3, implemented in the
darknet framework. The lower layers of the network have been
pre-trained on the ImageNet database for general-purpose object
recognition. This allows us to use only a small sample of manually
annotated images (fewer than fifty) to specialize the last layer on
the detection of our landmarks in our environments (transfer
learning). We train the network for more than 1000 iterations to find
the center and the width of each landmark that can appear in the
images. The main advantages of YOLO over its competitors are its speed
and its robustness, due to its capacity to consider the whole image as
informative context and to learn generalizable representations of
objects. In addition, YOLO requires relatively little on-board
processing capacity, which allows its use on small robotic platforms
or in very compact payloads handled by divers.
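To make the "center and width" output concrete, the sketch below converts one line of a darknet-style annotation file (class index followed by the box center, width and height, all normalized by the image size) into pixel corner coordinates. It is only an illustration of that label convention, not the code used in this work; the names `PixelBox` and `parse_darknet_label` and the image size are hypothetical.

```python
from typing import NamedTuple


class PixelBox(NamedTuple):
    """Axis-aligned bounding box in pixel coordinates (hypothetical helper type)."""
    class_id: int
    left: float
    top: float
    right: float
    bottom: float


def parse_darknet_label(line: str, image_width: int, image_height: int) -> PixelBox:
    """Convert 'class cx cy w h' (values normalized to [0, 1]) to pixel corners."""
    class_id, cx, cy, w, h = line.split()
    # Scale the normalized center and size back to pixels.
    cx_px, w_px = float(cx) * image_width, float(w) * image_width
    cy_px, h_px = float(cy) * image_height, float(h) * image_height
    return PixelBox(int(class_id),
                    cx_px - w_px / 2, cy_px - h_px / 2,
                    cx_px + w_px / 2, cy_px + h_px / 2)


# Example annotation for a single landmark class (index 0) in a 1920 x 1080 image.
box = parse_darknet_label("0 0.5 0.5 0.25 0.5", image_width=1920, image_height=1080)
print(box)
```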
Being able to carry out detections in near real time and in situ will
make it possible, when equipping the field, to position the landmarks
optimally by estimating the global network from local points of view.
It will also make it possible to index the acquired images
automatically by the numbers of the detected landmarks, and thus to
classify the image sequences by a neighborhood criterion rather than a
purely temporal one. Within these clusters, images can further be
classified by angle of passage (orientation of the landmarks), which
makes it possible to adjust the algorithm used for registration during
reconstruction.
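The indexing idea can be sketched in a few lines of Python. This is a minimal illustration under our own assumptions (detections are available as a mapping from image name to the list of identified landmark numbers; the file names and the functions `index_by_landmark` and `neighbour_pairs` are hypothetical), not the pipeline used in this work.

```python
from collections import defaultdict


def index_by_landmark(detections):
    """Map each landmark number to the sorted list of images in which it appears."""
    index = defaultdict(list)
    for image_name, landmark_ids in detections.items():
        for landmark_id in set(landmark_ids):  # ignore multiple detections
            index[landmark_id].append(image_name)
    return {landmark_id: sorted(names) for landmark_id, names in index.items()}


def neighbour_pairs(detections):
    """Landmarks seen together in at least one image are treated as neighbours."""
    pairs = set()
    for landmark_ids in detections.values():
        ids = sorted(set(landmark_ids))
        pairs.update((a, b) for i, a in enumerate(ids) for b in ids[i + 1:])
    return pairs


# Hypothetical detections: image 250 is far in time but close in space to image 1.
detections = {
    "img_0001.jpg": [9],
    "img_0002.jpg": [9, 5],
    "img_0003.jpg": [5],
    "img_0250.jpg": [9, 2],
}
index = index_by_landmark(detections)
pairs = neighbour_pairs(detections)
print(index[9], pairs)
```

Grouping by landmark number brings `img_0250.jpg` next to `img_0001.jpg`, which a purely temporal ordering would keep far apart.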
4. NETWORK OF GCP
To assess the quality of the GCP network during the qualification
phase of the fully automatic method, we need in situ manual
measurements. But unlike aerial work, it is difficult to count on an
absolute positioning of each landmark since there is no GPS
referencing underwater. As seen in the state of the art, the landmarks
are positioned by a succession of relative measurements (distances
between landmarks) and by trilateration between neighboring landmarks
of known depth (each landmark must be connected to at least three
others). The classic methodology used in underwater archaeology is
particularly costly in diver resources. We have therefore developed an
original method to optimize the use of dive time and make the results
more robust.
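The trilateration step mentioned above can be illustrated as follows. The sketch first reduces a measured (slant) distance between two landmarks of known depth to a horizontal distance, then solves for a 2D position from three already-positioned neighbors by subtracting the circle equations pairwise. It is a textbook formulation under simplifying assumptions (exact distances, exactly three neighbors, no least-squares adjustment), and the function names and coordinates are hypothetical.

```python
import math


def horizontal_distance(slant, depth_a, depth_b):
    """Project a slant distance onto the horizontal plane using the depth difference."""
    dz = depth_a - depth_b
    if abs(dz) > slant:
        raise ValueError("depth difference exceeds the measured distance")
    return math.sqrt(slant ** 2 - dz ** 2)


def trilaterate(anchors, ranges):
    """2D position from three anchors (x, y) and horizontal distances to them."""
    (x1, y1), (x2, y2), (x3, y3) = anchors
    r1, r2, r3 = ranges
    # Subtracting circle 1 from circles 2 and 3 gives a linear 2 x 2 system.
    a11, a12 = 2 * (x2 - x1), 2 * (y2 - y1)
    a21, a22 = 2 * (x3 - x1), 2 * (y3 - y1)
    b1 = r1 ** 2 - r2 ** 2 + x2 ** 2 - x1 ** 2 + y2 ** 2 - y1 ** 2
    b2 = r1 ** 2 - r3 ** 2 + x3 ** 2 - x1 ** 2 + y3 ** 2 - y1 ** 2
    det = a11 * a22 - a12 * a21
    if abs(det) < 1e-12:
        raise ValueError("anchors are collinear; position is not determined")
    return (b1 * a22 - a12 * b2) / det, (a11 * b2 - a21 * b1) / det


# Hypothetical local frame (meters): three known landmarks and one unknown.
anchors = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0)]
ranges = [math.sqrt(2.0), math.sqrt(10.0), math.sqrt(5.0)]
x, y = trilaterate(anchors, ranges)
print(round(x, 3), round(y, 3))
```

With noisy field measurements and more than three neighbors, a least-squares adjustment would normally replace the direct solve shown here.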
The originality of this manual method is to postpone the measurement
of the distance itself until after the dive. For this, the diver
carries strands, each identified by a unique number according to a
system inspired by the Inca quipu, a submersible slate, and a reel of
rope whose free end is attached to a weight.
The diver places the weight on the corner of a landmark and unwinds
the rope to the corner of a second landmark nearby. The diver then
marks the distance by fixing one of the strands on the rope and notes
the references on the slate: the number of the strand, the numbers of
both landmarks and their corresponding corner letters (for example
"2: 9A-5M"). The depth of each landmark is measured with a sensor and
associated with the corresponding number. Back from the mission, the
distances are measured a posteriori on the rope for each strand to
build the network of GCP.
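As an illustration of how such slate notes could be processed after the dive, the sketch below parses entries written in the "2: 9A-5M" form and joins them with the lengths read on the rope for each strand. The exact slate syntax beyond that single example, the function names and the sample length are assumptions made for this sketch.

```python
import re
from typing import NamedTuple


class SlateEntry(NamedTuple):
    strand: int
    landmark_a: int
    corner_a: str
    landmark_b: int
    corner_b: str


# Assumed syntax: "<strand>: <landmark><corner letter>-<landmark><corner letter>"
_ENTRY = re.compile(r"^\s*(\d+)\s*:\s*(\d+)([A-Z])\s*-\s*(\d+)([A-Z])\s*$")


def parse_slate_entry(text: str) -> SlateEntry:
    """Parse one slate line such as '2: 9A-5M'."""
    match = _ENTRY.match(text)
    if match is None:
        raise ValueError(f"unreadable slate entry: {text!r}")
    strand, lm_a, corner_a, lm_b, corner_b = match.groups()
    return SlateEntry(int(strand), int(lm_a), corner_a, int(lm_b), corner_b)


def build_network(slate_lines, rope_lengths_m):
    """Associate each pair of landmark corners with the length read on the rope."""
    edges = {}
    for line in slate_lines:
        entry = parse_slate_entry(line)
        key = ((entry.landmark_a, entry.corner_a), (entry.landmark_b, entry.corner_b))
        edges[key] = rope_lengths_m[entry.strand]
    return edges


# Hypothetical post-dive reading: strand 2 sits 2.41 m from the weighted end.
network = build_network(["2: 9A-5M"], {2: 2.41})
print(network)
```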
One advantage of this method is that it limits the errors due to
measurements made by hand in often complicated diving conditions
(cold, current, visibility, etc.). Another is that an analog version
of the measurements is kept, so the distances can be checked as many
times as necessary.
5. RESULTS
In order to evaluate our automatic detection algorithm, we built a
database of around 5000 images of our landmarks taken under different
conditions (from clear to poor visibility) at sea and in pools.
Approximately 33% of these images contain one landmark, 27% contain
two, and 25% contain three or four. The rest (15%) contain no landmark
at all. On average, 15% of the landmarks in the images are cut (by the
image edges) and 35% are partially obstructed (algae, fish, etc.). Our
deep learning algorithm detects 84% of the landmarks. Given the size
of our landmarks, a GSD (ground sample distance) of 20 to 50 mm is
necessary to allow detection. If we remove the landmarks that do not
meet this condition, the detection rate of our algorithm rises to 90%.
The false detection rate is less than 1% and the rate of multiple
detections of the same object is less than 2%.
Fig. 3. Some examples of partially masked or cut landmarks.
About 20% of the detected landmarks are not identifiable, either
because the GSD does not allow it (a GSD of 2 to 5 mm is needed for
identification), because the lighting is problematic (saturation or
absence of light), or because the obstruction or cut is too severe.
The algorithm correctly identifies the identifiable landmarks with a
success rate greater than 86%. Finally, we note that detection and
identification are very robust to partial obstruction, because more
than 85% of the landmarks in this case are correctly classified.
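To give an order of magnitude for these GSD thresholds, the sketch below computes how many pixels a 20 cm landmark spans at a given GSD, and estimates the GSD from the observation distance with the usual pinhole approximation (GSD = distance × pixel pitch / focal length). The camera parameters are hypothetical and refraction at the housing port is ignored, so the figures are indicative only.

```python
LANDMARK_SIDE_MM = 200.0  # landmarks measure 20 x 20 cm (Section 3)


def pixels_across_landmark(gsd_mm_per_px: float) -> float:
    """Number of pixels covered by one side of the landmark at a given GSD."""
    return LANDMARK_SIDE_MM / gsd_mm_per_px


def gsd_mm(distance_m: float, focal_length_mm: float, pixel_pitch_um: float) -> float:
    """Pinhole approximation of the GSD in mm per pixel (refraction ignored)."""
    return (distance_m * 1000.0) * (pixel_pitch_um / 1000.0) / focal_length_mm


# Footprint of a landmark at the detection and identification thresholds.
detection_px = (pixels_across_landmark(50.0), pixels_across_landmark(20.0))
identification_px = (pixels_across_landmark(5.0), pixels_across_landmark(2.0))
print(detection_px, identification_px)

# Hypothetical camera: 4 mm focal length, 4 micrometer pixels, 2.5 m away.
print(gsd_mm(distance_m=2.5, focal_length_mm=4.0, pixel_pitch_um=4.0))
```

Under these assumptions, a landmark covers 4 to 10 pixels per side at the detection threshold and 40 to 100 pixels at the identification threshold.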
We equipped the field and carried out the measurements (distances and
depths) between landmarks to form a network with our methodology on
three study areas located on the Mediterranean coast: Cap de Nice
(300 m² area, depths from 4 to 15 m, pebbles, rocks, posidonia meadows
and scree), Cap Ferrat (60 m², depths from 1.5 to 4 m, rocks and
posidonia meadow) and Collioure (120 m², depths from 2 to 3 m,
posidonia meadow, dead matte, rocks, broken pipes and gravel).
In the first area, we experimented with several configurations of the
GCP network, starting from the distributions in circles or ellipses
recommended by [1], inside which secondary landmarks are added. We
observed that a compromise between spatial and vertical
equidistribution (marking the different depth levels) makes it
possible to extract rich and very useful information for the analysis
of the three-dimensional morphology of the study area.
The landmarks are held on the seabed with a 1 kg weight so that they
are not moved by swell or currents during the mission. To equip the
Cap de Nice area, we estimate that around thirty landmarks are
necessary, and that ten are more than enough for each of the Collioure
and Cap Ferrat areas, which corresponds on average to one landmark per
10 m².
Fig. 4. In situ manual measurements of the network of underwater
landmarks.
Fig. 5. Collioure (left) and Cap Ferrat (right) GCP networks.
We measured about 1 minute on average, across all the tested areas,
for a single diver to take a complete measurement (less than 15 m in
length), including note taking (figure 4). The same measurements were
made with the classic tape measure method: 2 divers are then mobilized
and each measurement takes more than 2 minutes on average.
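The difference in diver time can be illustrated with a short calculation. The sketch below assumes a network of ten landmarks in which every landmark is linked to exactly three others (the minimum stated in Section 4), so that at least 15 distances are measured; the tape figure uses the 2 minute lower bound. The function names and the network size are illustrative assumptions.

```python
import math


def min_measurements(n_landmarks: int, links_per_landmark: int = 3) -> int:
    """Each distance links two landmarks, hence at least n * k / 2 measurements."""
    return math.ceil(n_landmarks * links_per_landmark / 2)


def diver_minutes(n_measurements: int, minutes_each: float, divers: int) -> float:
    """Total diver time spent, counting every mobilized diver."""
    return n_measurements * minutes_each * divers


n = min_measurements(10)                                # ten landmarks
rope = diver_minutes(n, minutes_each=1.0, divers=1)     # rope and strands
tape = diver_minutes(n, minutes_each=2.0, divers=2)     # tape measure, lower bound
print(n, rope, tape)
```

Under these assumptions the rope method costs about a quarter of the diver time of the tape method for the same network.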
With such a method, we can expect measurement errors of around 0.2% of
the distance (i.e. an accuracy of ±8 mm for landmarks 4 m apart),
which is comparable with the results usually obtained with tape
measures [13]. The reconstructed diagrams of the Collioure and Cap
Ferrat networks are presented in figure 5.
6. CONCLUSION
We have proposed in this article a methodology to optimize the
creation of networks of underwater GCP, both in speed and in quality,
and to prepare the automation of the process. We have shown that a
landmark design suited to underwater operational conditions allows
automatic detection and identification in images, in near real time,
by a deep learning algorithm with a success rate greater than 77%.
Most of the failures are linked either to the lighting (saturation) or
to an observation distance that is too large (GSD insufficient for
identification). For partially obstructed landmarks, the success rate
of detection and identification by our algorithm is greater than 85%.
This point is important because, as we have seen, about 50% of the
landmarks in underwater images are cut or partially obstructed.
We have also proposed a method to simplify in situ manual
measurements, which consists of recording distance information on
ropes and making the metric measurements off site. The procedure,
which takes less time, can then be carried out by a reduced team with
a precision comparable to that of conventional methods.
7. REFERENCES
[1] A. Bowens, Underwater archaeology: the NAS guide to principles and practice, 2011.
[2] P. Drap et al., “A Photogrammetric Process Driven By an Expert System: a New Approach for Underwater Archaeological Surveying Applied to the ‘Grand Ribaud F’ Etruscan Wreck,” in CVPR, 2003, vol. 1.
[3] J. Henderson et al., “Mapping Submerged Archaeological Sites using Stereo-Vision Photogrammetry,” International Journal of Nautical Archaeology, vol. 42, no. 2, pp. 243–256, 2013.
[4] S. Rubin et al., “Scuba Surveys to Assess Effects of Elwha Dam Removal on Shallow, Subtidal Benthic Communities,” in Elwha River Science Symposium, 2011, pp. 41–43.
[5] D. Skarlatos et al., “Precision Potential of Underwater Networks for Archaeological Excavation Through Trilateration and Photogrammetry,” ISPRS, vol. XLII-2/W10, pp. 175–180, 2019.
[6] E. Diamanti et al., “Geometric Documentation of Underwater Archaeological Sites,” Geoinformatics FCE CTU, vol. 11, pp. 37–48, 2013.
[7] C. Balletti et al., “Underwater Photogrammetry and 3D Reconstruction of Marble Cargos Shipwreck,” ISPRS, vol. XL-5/W5, 2015.
[8] S. Williams et al., “Repeated AUV Surveying of Urchin Barrens in North Eastern Tasmania,” in ICRA, 2010, pp. 293–299.
[9] N. A. Matthews, “Aerial and Close-Range Photogrammetric Technology: Providing Resource Documentation, Interpretation, and Preservation,” Tech. Rep., 2008.
[10] D. Skarlatos et al., “Photogrammetric Approaches for the Archaeological Mapping of the Mazotos Shipwreck,” in STIAC, 2010.
[11] L. Avanthey et al., “Light-weight tools to perform local dense 3d reconstruction of shallow water seabed,” Sensors, vol. 16, no. 5, pp. 712–742, 2016.
[12] J. Redmon et al., “YOLOv3: An incremental improvement,” arXiv preprint arXiv:1804.02767, 2018.
[13] P. Holt, “An Assessment of Quality in Underwater Archaeological Surveys Using Tape Measurements,” International Journal of Nautical Archaeology (IJNA), vol. 32, no. 2, pp. 246–251, 2003.