Figure 1 - uploaded by Yuri Rzhanov
Schematic of submersible Alvin. Three video cameras have frame size 720 by 480 pixels. Video is recorded on DVCam recorders.
Source publication
Seafloor imagery is an important tool for scientists engaged in quantifying geological and biological processes operating on the deep-ocean floor. Large-scale mosaics of seafloor imagery have significant advantages over individual still photographs and video footage as they are able to capture large areas while retaining sufficient resolution to id...
Citations
... According to a survey [Gra17], the first publications on the seabed segmentation task (also termed seafloor classification) appeared 25 years ago and are still scarce, the common ground between them being the use of "hand-crafted" image features and traditional machine learning algorithms, for example, random forest [Rim18]. New deep learning architectures of neural networks could replace image features and help analyze images more effectively, accurately, and quickly than ever before. [Figure 1: Converting underwater video material (from Atlantic Ocean) to images: marine mosaics for benthic studies (a-b), commercial software mosaics for panoramic view (c-d), and sample of equally spaced frames (e). Panels: (a) CCOM [Rzh06], (b) LAPM v1 [Mar13], (c) Microsoft ICE 2, (d) ArcSoft PM 6, (e) Sample frames.] ...
... In this study, the video mosaicking method, developed by the Center for Coastal and Ocean Mapping [Rzh06], was used. The method consists of the following steps: ...
Deep learning applications are attracting considerable interest nowadays and image analysis pipelines are no exception. Benthic studies often rely on the subjective evaluation of video material recorded using underwater drones. The demand for automatic image segmentation and quantitative evaluation arises due to the large volume of video data collected. This study performed a semantic segmentation task by training the PSPNet architecture with ResNet-34 backbone for 50 epochs using imagery prepared by simply extracting a few video frames or stitching a multitude of frames into a large 2D mosaic. Mosaicking is a particularly resource-intensive step; therefore, the possibility to skip such preprocessing would result in a more rapid analysis. The effect on the resulting segmentation quality was investigated by estimating the seabed coverage of three classes (Furcellaria lumbricalis, Mytilus edulis trossulus, and boulders) in video material obtained from the Baltic Sea. Segmentation success, measured by intersection over union, varied between 0.56 and 0.84, usually slightly better for frames than for the mosaic overall. Absolute differences in estimated coverage were negligible (mosaic vs. frames): 0.24% vs. 1.26% for furcellaria, 0.44% vs. 2.46% for mytilus, and 4.02% vs. 2.06% for boulders. Because the differences between predicted coverage and the mosaic-based ground truth are in an acceptable range, the findings suggest that the mosaicking step could be safely skipped in favor of a few equally spaced sample frames.
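The two quantities reported in this abstract, intersection over union and class coverage, can be illustrated with a minimal sketch. The label grids, the class id, and the function names below are hypothetical and are not taken from the study; the sketch only shows how the two measures are commonly computed.

```python
def iou(pred, truth, cls):
    """Intersection over union for one class on two equally sized label grids."""
    inter = union = 0
    for p_row, t_row in zip(pred, truth):
        for p, t in zip(p_row, t_row):
            in_p, in_t = p == cls, t == cls
            inter += in_p and in_t   # pixel labelled cls in both grids
            union += in_p or in_t    # pixel labelled cls in at least one grid
    return inter / union if union else float("nan")


def coverage(mask, cls):
    """Percentage of pixels in the grid that carry the label cls."""
    total = sum(len(row) for row in mask)
    hits = sum(label == cls for row in mask for label in row)
    return 100.0 * hits / total


# Hypothetical 3x4 label grids: 0 = background, 1 = one seabed class
truth = [[0, 1, 1, 0],
         [0, 1, 1, 0],
         [0, 0, 0, 0]]
pred = [[0, 1, 1, 1],
        [0, 1, 0, 0],
        [0, 0, 0, 0]]

class_iou = iou(pred, truth, 1)
coverage_error = abs(coverage(pred, 1) - coverage(truth, 1))
```

In this toy case the prediction misplaces one pixel, so the overlap is imperfect while the predicted and true coverage are identical. This is one reason coverage differences can be small even when intersection over union is well below 1.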
... Each frame was enhanced for more accurate pair-wise registration and video mosaics were produced using original non-enhanced video footage and pair-wise registration data. Algorithms for video mosaicking have been developed by Rzhanov et al. [9,10]. Taxonomic identification of benthic species was carried out with specialists' help using a digital catalog, in which more than 40 biological (fish, benthic invertebrates, algae, etc.) and physical (stones, substrate, burrows, footprints, etc.) categories were identified. ...
Underwater imagery is widely used for a variety of applications in marine biology and environmental sciences, such as classification and mapping of seabed habitats, marine environment monitoring and impact assessment, biogeographic reconstructions in the context of climate change, etc. This approach is relatively simple and cost-effective, allowing the rapid collection of large amounts of data. However, due to the laborious and time-consuming manual analysis procedure, only a small part of the information stored in the archives of underwater images is retrieved. Emerging novel deep learning methods open up the opportunity for more effective, accurate and rapid analysis of seabed images than ever before.
We present annotated images of the bottom macrofauna obtained from underwater video recorded in Spitsbergen island's European Arctic waters, Svalbard Archipelago. Our videos were filmed in both the photic and aphotic zones of polar waters, often influenced by melting glaciers. We used artificial lighting and shot close to the seabed (<1 m) to preserve natural colours and avoid the distorting effect of muddy water. The underwater video footage was captured using a remotely operated vehicle (ROV) and a drop-down camera. The footage was converted to 2D mosaic images of the seabed. 2D mosaics were manually annotated by several experts using the Labelbox tool and co-annotations were refined using the SurveyJS platform.
A set of carefully annotated underwater images associated with the original videos can be used by marine biologists as a biological atlas, as well as practitioners in the fields of machine vision, pattern recognition, and deep learning as training materials for the development of various tools for automatic analysis of underwater imagery.
... Underwater video footage of 9 sites, ranging in depth from ~2 to ~17 m, was collected by swimming in a lawnmower pattern along transects placed on the seafloor between September 4 and 9, 2016. Overlapping still frames were extracted from the video and stitched together into a single composite image using texture-based video mosaic (Rzhanov et al. 2006; Gu and Rzhanov 2006). To create a species map for each dive site, each mosaic was georeferenced and viewed on a high-resolution computer screen. ...
Topobathymetric lidar is becoming an increasingly valuable tool for benthic habitat mapping, enabling safe, efficient data acquisition over coral reefs and other fragile ecosystems. In 2014, a novel topobathymetric lidar system, the Experimental Advanced Airborne Research Lidar-B (EAARL-B), was used to acquire data in priority habitat areas in the U.S. Virgin Islands (USVI), spanning the 0–44-m depth range. In this study, new algorithms and procedures were developed for generating seafloor relative reflectance, along with a suite of shape-based waveform features from EAARL-B. Waveform features were then correlated with percent cover of coral morphologies, domed and branched, and total cover of hard and soft corals. Results show that the EAARL-B can be used to produce useful seafloor relative reflectance mosaics and also that the additional waveform shape-based features contain additional information that may benefit habitat classification—specifically, to aid in distinguishing among hard corals and their coral morphologies, domed and branched. Knowing the spatial extent of changes in coral communities is important to the understanding of resiliency of coral reefs under stress from human impacts.
... Divers moved along transects on the seafloor in a lawnmower pattern at an almost constant height above the seafloor (1.5-2.5 m) and a speed appropriate for generating contiguous images. Video transects of each site were stitched together into a single composite image (100 m²) of the entire site using texture-based video mosaic (e.g., scene, shape) assembly (Rzhanov et al. 2006). Still images (or video frames) were co-registered and positioned in a common frame of reference. ...
Landscape patterns created by the structure and form of foundational species shape ecological processes of community assembly and trophic interactions. In recent years, major shifts in foundation species have occurred in multiple ecosystems. In temperate marine systems, many kelp beds have shifted to turf macroalgae habitats with unknown consequences on seascape patterns or changes in the ecological processes that maintain communities. We investigated the effect of turf macroalgae on seascape patterns in three habitats dominated by kelp and turf macroalgae and those that have mixed species composition. We also examined decadal elevations in temperature with known growth and reproductive phenology of kelp and turf macroalgae to provide a mechanistic understanding of the factors that will continue to shape these seascapes. Our results indicate that turf macroalgae produce a more heterogeneous habitat with greater primary free space than those that are mixed or dominated by kelp. Further, we examined the relationship between seascape patterns and richness and abundance of fishes in each habitat. Results showed that patch size was positively related to the abundance of fish in habitat types, suggesting that turf‐induced heterogeneity may lead to fewer observed fishes, specifically the mid‐trophic level species, cunner, in these habitats. Overall, our results suggest that persistence of this habitat is facilitated by increasing temperatures that shorten the phenology of kelps and favor the growth and reproduction of turf macroalgae, leaving them poised to take advantage of free space, regardless of season.
... In total, 7 relief models were created, with a combined area of 10.1 km² (Table 2). Mosaics were compiled from the video recordings (Rzhanov et al. 2006) and used to build an atlas of the features visible in the seabed images. The atlas comprises 50-60 biological (molluscs, crustaceans, echinoderms, etc.) and physical (substrate types, burrows, etc.) features. ...
The sea coast is one of the most changeable places of the geosphere on Earth. Because sediment is constantly exchanged between sea and shore, the coastal profile can change fundamentally during a stronger storm. However, despite the rapidly changing environment, the morphological elements that make up the coast (beach, foredune) persist. The aim of this work is to determine how the morphometric indicators of the beach change in erosional and accumulative coastal sections. To this end, an erosional (Melnragė I) and an accumulative (Smiltynė) coastal stretch were selected, where, since 2002, after the reconstruction of the Klaipėda port jetties was completed and the entrance channel was deepened to 14.5 m, coastal erosion has intensified north of the jetties and accumulation south of them. From 2002 to 2017, once a year (in spring), cross-shore levelling surveys were carried out north and south of the port jetties, which established the changes in the coastal profile, the shoreline position, the beach width, the amount of sand washed out of the beach, and the amount of sand accumulated on the beach. It was found that, despite large changes in shoreline position and in the amount of sediment on the beach, the beach tends to maintain constant morphometric indicators. In the erosional section, the beach shape is maintained by drawing on sand reserves stored in the foredune; in the accumulative section, by forming embryonic dunes in the upper part of the beach.
... Furthermore, an increased computational cost is implied. Where navigation data are available, it can be incorporated into registration to improve robustness [28], [42]. As an exhaustive search for pairs of frames for homography estimation is quadratic in the number of frames, state-of-the-art techniques use domain-specific knowledge to make this process more efficient. As image data are generally captured by video cameras, homography estimation between successive frames can be used to estimate an initial camera trajectory that can be used to detect additional overlapping pairs of frames [20], [27], [28]. ...
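The idea in this excerpt, using an initial trajectory from successive-frame registration to shortlist further overlapping pairs, can be sketched as follows. This is a simplified illustration under stated assumptions: motion is reduced to pure translations, overlap is approximated by the distance between estimated frame centres, and the names `accumulate`, `candidate_pairs`, `max_dist`, and `min_gap` are hypothetical. The distance check itself still visits every pair, but it is far cheaper than attempting image registration on every pair.

```python
from math import hypot


def accumulate(shifts):
    """Chain successive-frame translations into approximate frame positions."""
    positions = [(0.0, 0.0)]
    for dx, dy in shifts:
        x, y = positions[-1]
        positions.append((x + dx, y + dy))
    return positions


def candidate_pairs(positions, max_dist, min_gap=2):
    """Non-successive frame pairs whose estimated centres lie within max_dist."""
    pairs = []
    for i in range(len(positions)):
        for j in range(i + min_gap, len(positions)):
            dx = positions[i][0] - positions[j][0]
            dy = positions[i][1] - positions[j][1]
            if hypot(dx, dy) <= max_dist:
                pairs.append((i, j))
    return pairs


# Hypothetical track: up one lane, step sideways, back down the next lane
shifts = [(0.0, 10.0), (0.0, 10.0), (5.0, 0.0), (0.0, -10.0), (0.0, -10.0)]
positions = accumulate(shifts)
pairs = candidate_pairs(positions, max_dist=6.0)
```

Only the shortlisted pairs would then be passed to the (expensive) homography estimation step.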
... Center-weighting is an effective means of dealing with vignetting as only the well-lit central regions are kept in the mosaic [16]. Many underwater mosaicing algorithms [20], [27], [42] also use forms of center-weighting as underwater mosaics are typically built from downward facing cameras and so the illumination pattern caused by artificial lighting resembles a vignetting artefact. However, cameras in UWTV surveys are typically slightly forward-facing and so the illumination pattern in video frames is not symmetric and turbidity of the water can cause blurring of texture in the top portion of the frame. ...
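One possible form of center-weighting is a weighted average in which each frame's contribution to a mosaic pixel falls off towards the frame border. The sketch below assumes a separable triangular weight and a 720 by 480 frame; the function names and the sample intensities are hypothetical, and the cited algorithms may use different weight shapes or keep only the highest-weighted frame.

```python
def center_weight(x, y, width, height):
    """Weight in (0, 1]: largest at the frame centre, smallest at the border."""
    wx = 1.0 - abs((x + 0.5) / width * 2.0 - 1.0)
    wy = 1.0 - abs((y + 0.5) / height * 2.0 - 1.0)
    return wx * wy


def blend(samples):
    """Weighted mean of (intensity, weight) samples covering one mosaic pixel."""
    total = sum(w for _, w in samples)
    return sum(v * w for v, w in samples) / total


# One mosaic pixel seen near the centre of frame A and near the edge of frame B
w_a = center_weight(360, 240, 720, 480)
w_b = center_weight(700, 240, 720, 480)
pixel = blend([(200.0, w_a), (80.0, w_b)])
```

The blended value stays close to the sample from frame A, because the poorly lit border sample from frame B receives a much smaller weight. A symmetric weight like this one does not address the asymmetric illumination of forward-facing cameras described in the excerpt.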
This paper proposes an algorithm for mosaicing videos generated during stock assessment of seabed-burrowing species. In these surveys, video transects of the seabed are captured and the population is estimated by counting the number of burrows in the video. The mosaicing algorithm is designed to process a large amount of video data and summarize the relevant features for the survey in a single image. Hence, the algorithm is designed to be computationally inexpensive while maintaining a high degree of robustness. We adopt a registration algorithm that employs a simple translational motion model and generates a mapping to the mosaic coordinate system using a concatenation of frame-by-frame homographies. A temporal smoothness prior is used in a maximum a posteriori homography estimation algorithm to reduce noise in the motion parameters in images with small amounts of texture detail. A multiband blending scheme renders the mosaic and is optimized for the application requirements. Tests on a large data set show that the algorithm is robust enough to allow the use of mosaics as a medium for burrow counting. This will increase the verifiability of the stock assessments as well as generate a ground truth data set for the learning of an automated burrow counting algorithm.
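The mapping into mosaic coordinates by concatenating frame-by-frame homographies can be sketched with plain 3x3 matrices. The sketch assumes a purely translational motion model, takes frame 0 as the mosaic coordinate system, and uses made-up shift values; the helper names are hypothetical, and the temporal smoothness prior and blending described in the abstract are not modelled.

```python
def translation(dx, dy):
    """3x3 homography for a pure translation by (dx, dy) pixels."""
    return [[1.0, 0.0, dx],
            [0.0, 1.0, dy],
            [0.0, 0.0, 1.0]]


def matmul(a, b):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def frame_to_mosaic(pairwise):
    """pairwise[i] maps frame i+1 coordinates into frame i coordinates.

    Returns one homography per frame that maps it into frame 0 coordinates.
    """
    transforms = [translation(0.0, 0.0)]
    for h in pairwise:
        transforms.append(matmul(transforms[-1], h))
    return transforms


def apply(h, x, y):
    """Apply homography h to the point (x, y) in homogeneous coordinates."""
    xh, yh, w = (h[r][0] * x + h[r][1] * y + h[r][2] for r in range(3))
    return xh / w, yh / w


# Hypothetical frame-to-frame shifts in pixels
shifts = [translation(0.0, 12.0), translation(1.5, 11.0), translation(-0.5, 13.0)]
to_mosaic = frame_to_mosaic(shifts)
origin_of_last = apply(to_mosaic[-1], 0.0, 0.0)
```

Because each new transform is built from the previous one, registration errors accumulate along the chain, which is one reason noise reduction in the motion parameters matters for long transects.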
Harvesting the commercially significant lobster, Nephrops norvegicus, is a multimillion dollar industry in Europe. Stock assessment is essential for maintaining this activity but it is conducted by manually inspecting hours of underwater surveillance videos. To improve this tedious process, we propose an automated procedure. This procedure uses mosaics for detecting the Nephrops, which improves visibility and reduces the tedious video inspection process to the browsing of a single image. In addition to this novel application approach, key contributions are made for handling the difficult lighting conditions in these kinds of videos. Mosaics are built using 1-10 minutes of footage and candidate Nephrops regions are selected using image segmentation based on local image contrast and colour features. A K-Nearest Neighbour classifier is then used to select the respective Nephrops from these candidate regions. Our final decision accuracy at 87.5% recall and precision shows a corresponding 31.5% and 79.4% improvement compared with previous work [1].
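The final classification step mentioned here, a K-Nearest Neighbour vote over candidate regions, can be sketched in a few lines. The two features (a local contrast score and a colour score), the training values, and the function name are invented for illustration and do not come from the paper.

```python
from collections import Counter


def knn_predict(train, query, k=3):
    """Majority label among the k training samples nearest to query.

    train is a list of (feature_vector, label) pairs; distance is squared Euclidean.
    """
    def dist(item):
        return sum((a - b) ** 2 for a, b in zip(item[0], query))

    nearest = sorted(train, key=dist)[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


# Hypothetical (local contrast, colour score) features for candidate regions
train = [((0.90, 0.80), "nephrops"), ((0.80, 0.90), "nephrops"),
         ((0.85, 0.70), "nephrops"), ((0.20, 0.10), "background"),
         ((0.10, 0.30), "background"), ((0.30, 0.20), "background")]

label = knn_predict(train, (0.75, 0.75))
```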
... Photo-maps, also known as mosaics, are assembled by stitching together images collected from an air [1], underwater [2,3,4] or ground vehicle. The images are typically collected from a downward looking camera and it is implicitly assumed that there is an underlying dominant plane which the camera is observing normally. ...
An autonomous aerial or underwater robot can create photo-maps using a downward looking camera to not only compute its odometry visually but also to provide a useful and intuitively understandable representation of the environment explored by it. The improved Fourier Mellin Invariant (iFMI) registration is a spectral registration method, which has specific benefits, especially high robustness in featureless scenarios, but it only allows registrations of 2D translations, rotation, and scale. The method is extended here to incorporate tilt using parallax information. To this end, we extend the well-known four-point algorithm for planar homography. We show that using the decomposition of the planar homography to compute the tilt is very noise-prone, and propose a way of increasing this accuracy based on a parallax to noise metric. Although our general approach can be used with local scale invariant image features, we implement the tilt-correction based on an extension of our frequency-based approach to determine the image motion-field. Two experiments are presented to show the efficacy and applicability of our approach: An analysis of a simulated data set with ground truth is used to quantify the robustness of our novel method relative to the same four-point method using only SIFT (Scale Invariant Feature Transform) features. A second data set is used to present similar results with real-world data.
... This is related to visual odometry, because the vehicle motion is estimated in this process. Finding a template in an image is also known as image registration [Fitch et al., 2005, Stricker, 2001, Dorai et al., 1998, Brown, 1992, Alliney and Morandi, 1986, Lucas and Kanade, 1981, Pratt, 1973, Rzhanov et al., 2000, Rzhanov et al., 2006]. But since the robot usually does some non-trivial motion, finding out the overlapping parts of images is not sufficient to solve this task. ...
Being able to generate maps is a significant capability for mobile robots. Measuring the performance of robotic systems in general, but also particularly of their mapping, is important in different aspects. Performance metrics help to assess the quality of developed solutions, thus driving the research towards more capable systems. During the procurement and safety testing of robots, performance metrics ensure comparability of different robots and allow for the definition of standards.
In this thesis, evaluation methods for the maps produced by robotic systems are developed. Those maps always contain errors, but measuring and classifying those errors is a non-trivial task. The algorithm has to analyze and evaluate the maps in a systematic, repeatable and reproducible way. The problem is approached systematically: First the different terms and concepts are introduced and the state of the art in map evaluation is presented.
Then a special type of mapping using video data is introduced and a path-based evaluation of the performance of this mapping approach is made. This evaluation does not work on the produced map, but on the localization estimates of the mapping algorithm. The rest of the thesis then works on classical two-dimensional grid maps.
A number of algorithms to process those maps are presented. An Image Thresholder extracts information about occupied and free cells, while a Nearest Neighbor Remover or an Alpha Shape Remover is used to filter out noise from the maps. All of this is needed to process the maps automatically.
Then the first novel map evaluation method, the Fiducial algorithm, is developed. In this place-based method, artificial markers that are distributed in the environment are detected in the map. The errors of the positions of those markers with respect to the known ground truth positions are used to calculate a number of attributes of the map. Those attributes can then be weighted according to the needs of the application to generate a single-number map metric.
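The general pattern described here, marker position errors turned into map attributes that are then weighted into one number, can be sketched as follows. The two attributes (`coverage` and `accuracy`), the `tolerance` parameter, the weights, and the marker coordinates are all hypothetical; the thesis defines its own attributes, which the abstract does not list.

```python
from math import hypot


def map_attributes(detected, truth, tolerance):
    """Compute two illustrative attributes in [0, 1] from marker positions.

    detected and truth map marker ids to (x, y) positions.
    """
    errors = [hypot(detected[m][0] - truth[m][0], detected[m][1] - truth[m][1])
              for m in truth if m in detected]
    coverage = len(errors) / len(truth)            # share of markers found in the map
    mean_error = sum(errors) / len(errors) if errors else tolerance
    accuracy = max(0.0, 1.0 - mean_error / tolerance)
    return {"coverage": coverage, "accuracy": accuracy}


def map_score(attributes, weights):
    """Weighted mean of the attributes, giving a single-number metric."""
    total = sum(weights.values())
    return sum(attributes[name] * w for name, w in weights.items()) / total


# Hypothetical ground truth and detected marker positions (metres)
truth = {"A": (0.0, 0.0), "B": (10.0, 0.0), "C": (10.0, 10.0), "D": (0.0, 10.0)}
detected = {"A": (0.0, 0.0), "B": (10.0, 1.0), "C": (13.0, 14.0)}

attrs = map_attributes(detected, truth, tolerance=4.0)
score = map_score(attrs, {"coverage": 1.0, "accuracy": 3.0})
```

Changing the weights changes which kind of map error dominates the score, which is the sense in which the metric can be tuned to an application.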
The main contribution of this thesis is the second novel map evaluation algorithm, which uses a graph that represents the environment topologically. This structure-based approach abstracts from all other information in the map and just uses the topological information about which areas are directly connected to assess the quality of the map. The different steps needed to generate this topological graph are extensively described. Then different ways to compare the similarity of two vertices from two graphs are presented and compared. This is needed to match two graphs to each other: the graph from the map to be evaluated and the graph of a known ground truth map. Using this match, the same map attributes as those from the Fiducial algorithm can be computed. Additionally, another interesting attribute, the brokenness value, can be determined. It counts the large broken parts in the map that internally contain few errors but that have, relative to the rest of the map, an error in the orientation due to a singular error during the mapping process.
Experiments made on many maps from different environments are then performed for both map metrics. Those experiments show the usefulness of said algorithms and compare their results among each other and against the human judgment of maps.
... A Cartesian reference system was adopted based only on a mosaic produced from the frame-imagery survey. Although there are several well-known ortho-mosaicking procedures available in commercial software suites (ASPRS 2004), it was more convenient to write a mosaic code (Rzhanov et al. 2006; Pe'eri and Rzhanov 2008) with complete control over the algorithms and the mosaic output products. The mosaicking process includes three main steps: (1) attitude correction, (2) pair-wise alignment and (3) global adjustment. ...
... where x and y are the horizontal shifts along the across track, X, and the along track, Y, axes (lateral camera motion); S is the scale factor (vertical camera shift perpendicular to the imaged surface); and ϕ is the rotation angle around the camera optical axis. Other available transforms were found not to be suitable for this study (Rzhanov et al. 2006). Although the mosaic quality of an eight-parameter projective transform (also known as perspective transform) is the highest, the resulting mosaic may not be parallel to the ground. ...
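A four-parameter transform with the parameters named in this excerpt (shifts x and y, scale S, rotation ϕ) is commonly written as a rotation and uniform scaling followed by a translation. The sketch below uses that common convention as an assumption; the excerpt does not show the citing paper's own equation, so the order of operations and the sign of the rotation may differ there. The function name and sample values are hypothetical.

```python
from math import cos, radians, sin


def similarity(point, x, y, s, phi):
    """Rotate point by phi (radians), scale by s, then shift by (x, y)."""
    px, py = point
    c, k = cos(phi), sin(phi)
    return (s * (c * px - k * py) + x,
            s * (k * px + c * py) + y)


# Hypothetical parameters: shift (10, -5), scale 2, rotation 90 degrees
moved = similarity((1.0, 0.0), x=10.0, y=-5.0, s=2.0, phi=radians(90))
```

Unlike the eight-parameter projective transform, this model cannot tilt the image plane, which is consistent with the excerpt's point that a mosaic built from it stays parallel to the imaged surface.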
More and more littoral surveys are conducted with aerial platform sensor suites that include a hyperspectral pushbroom sensor and a frame camera. However, in some cases, data from auxiliary sensors may contain errors or are not available. For many research groups, a high-accuracy registration of the multi-sensor data relative to each other is essential, while absolute geo-location of each individual measurement is not. A co-registration procedure was developed for pseudo ortho-rectification of the hyperspectral imagery and to remove spatial distortions caused by the aircraft's trajectory during the survey based on a flat-earth assumption. This image-processing approach utilizes the aerial frame imagery as a reference. Each hyperspectral scan line is co-registered into the frame imagery coordinate system. The performance of the procedure was evaluated using hyperspectral imagery collected over northern New Hampshire and southern Maine. The evaluation results showed that the procedure is robust enough to pseudo ortho-rectify hyperspectral imagery over coastal areas and to remove significant spatial distortions.