Article (PDF available)

Abstract

Component trees are region-based representations that encode the inclusion relationship of the threshold sets of an image. They are among the most promising strategies for analyzing and interpreting the spatial information of complex scenes, as they allow a simple and efficient implementation of connected filters. This work proposes a new efficient hybrid algorithm for the parallel computation of two particular component trees, the max- and the min-tree, in shared and distributed memory environments. For the node-local computation, a modified version of Salembier's flooding-based algorithm is employed. A novel tuple-based merging scheme makes it possible to merge the acquired partial results into a globally correct view. With the proposed approach, a speed-up of up to 44.88× was achieved using 128 processing cores on eight-bit gray-scale images. This is more than a five-fold increase over the state-of-the-art shared-memory algorithm, while requiring only one thirty-second of the memory.
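The abstract above centers on the max-tree structure itself. As a point of reference, a minimal sequential max-tree construction can be sketched with a union-find, leaf-to-root pass in the style of Berger et al.; this is not the modified Salembier flooding the paper uses, and all names below are illustrative:

```python
import numpy as np

def max_tree(img):
    """Build a max-tree parent array with union-find (4-connectivity).
    parent[p] points toward darker levels; each level component has one
    canonical element after the final canonicalisation pass."""
    h, w = img.shape
    flat = img.ravel()
    n = flat.size
    # Process pixels from brightest to darkest (stable sort for determinism).
    order = np.argsort(flat, kind="stable")[::-1]
    parent = np.full(n, -1, dtype=np.int64)
    zpar = np.full(n, -1, dtype=np.int64)   # union-find shortcut array

    def find(x):
        while zpar[x] != x:
            zpar[x] = zpar[zpar[x]]         # path halving
            x = zpar[x]
        return x

    for p in order:
        parent[p] = p
        zpar[p] = p
        y, x = divmod(int(p), w)
        for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w:
                q = ny * w + nx
                if zpar[q] != -1:           # neighbour already processed
                    r = find(q)
                    if r != p:
                        parent[r] = p       # attach brighter subtree below p
                        zpar[r] = p
    # Canonicalisation: darkest-to-brightest, so parents are fixed first.
    for p in order[::-1]:
        q = parent[p]
        if flat[parent[q]] == flat[q]:
            parent[p] = parent[q]
    return parent
```

The root satisfies `parent[root] == root` and every parent link is non-increasing in gray value.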
... Many algorithms have been developed to speed up the max-tree computation. So far, the proposed optimization techniques fall roughly into one of three categories: (a) algorithmic optimizations, i.e., choosing between a top-down or a bottom-up construction with adapted data structures [18], [24], [25]; (b) thread-level parallelism, i.e., classical parallelism for shared-memory multiprocessors (SMP) [26], [27], [28]; (c) distributed computing, i.e., joint max-tree computation across distributed memory [29], [30]. To the best of our knowledge, this is the first time a max-tree algorithm has been proposed for massively parallel architectures that fits the SIMT paradigm of GPUs. ...
... The first parallel algorithms [26], [27], [28] used this algorithm on shared-memory systems with scalable results (speed-up almost linear in the number of threads). As images grew in size, the same strategies were adopted for a distributed computation of the max-tree [29], [36], with the extra burden of minimizing memory exchanges between (cluster) nodes using border max-trees. This idea is pushed even further in [30], [37] with a distributed max-tree representation based on border max-trees that avoids storing the final tree in shared memory and enables distributed tree processing. ...
... In [29], the authors suggest duplicating the tile boundaries (called the halo, see figure 10) so that CONNECT is called in global memory on two nodes with the same levels. The CONNECT procedure can be seen as a two-stage process. ...
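The halo duplication mentioned in this excerpt can be sketched as a simple tiling helper (an illustrative sketch only; `split_with_halo` is a hypothetical name, and the CONNECT procedure itself is not reproduced here):

```python
import numpy as np

def split_with_halo(img, tiles_y, tiles_x, halo=1):
    """Split an image into a tiles_y x tiles_x grid, padding each tile
    with a copy of its neighbours' border rows/columns (the 'halo') so
    that a later merge step can connect components across the seams.
    Returns a list of ((y0, x0) offset, tile) pairs."""
    h, w = img.shape
    ys = np.linspace(0, h, tiles_y + 1, dtype=int)
    xs = np.linspace(0, w, tiles_x + 1, dtype=int)
    out = []
    for i in range(tiles_y):
        for j in range(tiles_x):
            # Extend each tile by `halo` pixels, clamped at image borders.
            y0, y1 = max(ys[i] - halo, 0), min(ys[i + 1] + halo, h)
            x0, x1 = max(xs[j] - halo, 0), min(xs[j + 1] + halo, w)
            out.append(((y0, x0), img[y0:y1, x0:x1].copy()))
    return out
```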
Article
Full-text available
In Mathematical Morphology, the max-tree is a region-based representation that encodes the inclusion relationship of the threshold sets of an image. This tree has proved useful in numerous image processing applications. Over the last decade, work has focused on improving the construction time of this structure, mixing algorithmic optimizations with parallel and distributed computing. Nevertheless, there is still no algorithm that benefits from the computing power of massively parallel architectures. In this work, we propose the first GPU algorithm to compute the max-tree. The proposed approach leads to significant speed-ups, and is up to one order of magnitude faster than the current state-of-the-art parallel CPU algorithms. This work paves the way for max-tree integration in GPU image processing pipelines and real-time image processing based on Mathematical Morphology. It is also a foundation for porting other image representations from Mathematical Morphology to GPUs.
... Recently, these tools have been improved to process Giga and Tera-Scale data sets using shared-memory [7,8] and distributed-memory techniques [9,10], or a combination of both [11,12]. These approaches typically build upon a divide-and-conquer approach where the data set is split into several tiles. ...
... In [4,6,8], the authors investigate distributed memory algorithms to compute min and max trees for terabytes images. In [5], computation of minimum spanning trees of streaming images is considered. ...
Preprint
Full-text available
Binary Partition Hierarchies (BPH) and minimum spanning trees are fundamental data structures involved in hierarchical analysis such as quasi-flat zones or watershed. However, classical BPH construction algorithms require the whole data to be in memory, which prevents the processing of large images that cannot fit entirely in the main memory of the computer. To cope with this problem, an algebraic framework leading to a high-level calculus was introduced, allowing an out-of-core computation of BPHs. This calculus relies on three operations: select, join, and insert. In this article, we introduce three efficient algorithms to perform these operations, providing pseudo-code and complexity analysis.
... The parallel implementation is based on Salembier's non-recursive flooding algorithm [30], the subtree merging procedure described in [31] and the concurrent direct filter [32]. Two optimisations are introduced to obtain higher performance. ...
Article
Full-text available
Machine protection is a core task of real-time image diagnostics aiming for steady-state operation in nuclear fusion devices. The paper evaluates the applicability of the newest low-power NVIDIA Jetson Xavier NX platform for image plasma diagnostics. This embedded NVIDIA Tegra System-on-a-Chip (SoC) integrates a Graphics Processing Unit (GPU) and Central Processing Unit (CPU) on a single chip. The hardware differences and features compared to the previous NVIDIA Jetson TX2 are highlighted. Implemented algorithms detect thermal events in real-time, utilising the high parallelism provided by the embedded General-Purpose computing on Graphics Processing Units (GPGPU). The performance and accuracy are evaluated on the experimental data from the Wendelstein 7-X (W7-X) stellarator. Strike-line and reflection events are primarily investigated, yet benchmarks for overload hotspots, surface layers and visualisation algorithms are also included. Their detection might allow for automating real-time risk evaluation incorporated in the divertor protection system in W7-X. For the first time, the paper demonstrates the feasibility of complex real-time image processing in nuclear fusion applications on low-power embedded devices. Moreover, GPU-accelerated reference processing pipelines yielding higher accuracy compared to the literature results are proposed, and a remarkable performance improvement resulting from the upgrade to the Xavier NX platform is attained.
... On the contrary, merging trees is not a straightforward process due to the memory and computational costs brought by the spatio-temporal connectivity. Existing methods to build a max-tree usually proceed by separately merging trees corresponding to small parts of the image [61]; the tree of shapes [30] and the α-tree [68] have also led to merging-based parallel algorithms, and [107] deals with extreme-dynamic-range data by merging sub-trees of sub-images. Algorithm 2 provides the pseudo-code for merging two max-trees. ...
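The kind of merge step referenced here can be sketched, in a simplified and attribute-free form, as the classic root-path climb used by merging-based max-tree algorithms (assumptions: `parent[root] == -1`, arrays indexed by pixel; this is an illustrative sketch, not the cited Algorithm 2 itself):

```python
def connect(parent, level, x, y):
    """Merge the root paths of two adjacent pixels x and y that belong
    to max-trees built on neighbouring tiles. `level` holds gray values;
    levroot(p) returns the canonical node of p's level component."""
    def levroot(p):
        while parent[p] != -1 and level[parent[p]] == level[p]:
            p = parent[p]
        return p

    x, y = levroot(x), levroot(y)
    if level[x] < level[y]:
        x, y = y, x                        # keep x at the brighter level
    while x != y and y != -1:
        z = -1 if parent[x] == -1 else levroot(parent[x])
        if z != -1 and level[z] >= level[y]:
            parent[x] = z                  # x's old parent chain still valid
            x = z
        else:
            parent[x] = y                  # re-attach x below the other tree
            x, y = y, z                    # continue climbing, roles swapped
    return parent
```

For example, merging the 1D tiles [2, 1] and [1, 3] at the seam joins the two level-1 components into a single root.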
Thesis
Full-text available
Although morphological hierarchies are today a well-established framework for single-frame image processing, their extension to time-related data remains largely unexplored. This thesis aims to tackle the analysis of satellite image time series with tree-based representations. To do so, we distinguish between three kinds of models, namely spatial, temporal and spatio-temporal hierarchies. For each model, we propose a streaming algorithm to update the tree when new images are appended to the series. Besides, we analyze the structural properties of the different tree-building strategies, which requires some projection methods for the spatio-temporal tree in order to obtain comparable structures. The trees are then compared according to their node distribution, filtering capability and cost, leading to a superiority of the spatio-temporal tree (a.k.a. space-time tree). Hence, we review spatio-temporal attributes, including some new ones, that can be extracted from the space-time tree in order to compute multiscale features at the pixel or image level. These attributes are finally involved in tools such as filtering and pattern spectra for various remote-sensing applications.
... Ye, Liu [31] reported on a raster dataset cleaning and reconstitution multi-grid architecture for remote sensing monitoring of vegetation dryness by using several operations on different types of datasets. Gotz, Cavallaro [32] proposed a hybrid algorithm for the parallel computation of two particular trees. The flooding-based algorithm was employed for the node-local computation, while the tuple-based merging scheme was adopted to merge the partial images into a globally correct view. ...
Article
Full-text available
The volume of remote sensing images continues to grow as image sources become more diversified and with increasing spatial and spectral resolution. The handling of such large-volume datasets, which exceed available CPU memory, in a timely and efficient manner is becoming a challenge for single machines. The distributed cluster provides an effective solution with strong calculation power. There has been an increasing number of big data technologies that have been adopted to deal with large images using mature parallel technology. However, since most commercial big data platforms are not specifically developed for the remote sensing field, two main issues exist in processing large images with big data platforms using a distributed cluster. On the one hand, the quantities and categories of official algorithms used to process remote sensing images in big data platforms are limited compared to large amounts of sequential algorithms. On the other hand, the sequential algorithms employed directly to process large images in parallel over a distributed cluster may lead to incomplete objects in the tile edges and the generation of large communication volumes at the shuffle stage. It is, therefore, necessary to explore the distributed strategy and adapt the sequential algorithms over the distributed cluster. In this research, we employed two seed-based image segmentation algorithms to construct a distributed strategy based on the Spark platform. The proposed strategy focuses on modifying the incomplete objects by processing border areas and reducing the communication volume to a reasonable size by limiting the auxiliary bands and the buffer size to a small range during the shuffle stage. We calculated the F-measure and execution time to evaluate the accuracy and execution efficiency. The statistical data reveal that both segmentation algorithms maintained accuracy as high as that achieved on the reference image segmented sequentially. Moreover, the strategy generally took less execution time than configurations with significantly larger auxiliary bands and buffer sizes. The proposed strategy can modify incomplete objects, with execution times twice as fast as strategies that do not employ communication-volume reduction in the distributed cluster.
Chapter
Binary Partition Hierarchies (BPH) and minimum spanning trees are fundamental data structures involved in hierarchical analysis such as quasi-flat zones or watershed. However, classical BPH construction algorithms require the whole data to be in memory, which prevents the processing of large images that cannot fit entirely in the main memory of the computer. To cope with this problem, an algebraic framework leading to a high-level calculus was introduced, allowing an out-of-core computation of BPHs. This calculus relies on three operations: select, join, and insert. In this article, we introduce three efficient algorithms to perform these operations, providing pseudo-code and complexity analysis.
Article
The High-Performance and Disruptive Computing in Remote Sensing (HDCRS) Working Group (WG) was recently established under the IEEE Geoscience and Remote Sensing Society (GRSS) Earth Science Informatics (ESI) Technical Committee to connect a community of interdisciplinary researchers in remote sensing (RS) who specialize in advanced computing technologies, parallel programming models, and scalable algorithms. HDCRS focuses on three major research topics in the context of RS: 1) supercomputing and distributed computing, 2) specialized hardware computing, and 3) quantum computing (QC). This article presents these computing technologies as they play a major role for the development of RS applications. The HDCRS disseminates information and knowledge through educational events and publication activities which will also be introduced in this article.
Chapter
Binary partition hierarchies and minimum spanning trees are key structures for numerous hierarchical analysis methods, such as those involved in computer vision and mathematical morphology. In this article, we consider the problem of their computation in an out-of-core manner, i.e., by minimizing the size of the data structures that are simultaneously needed at the different computation steps. Out-of-core algorithms are necessary when the data are too large to fit entirely in the main memory of the computer, which can be the case with very large images in 2-, 3-, or higher-dimensional space. We propose a new algebraic framework composed of four main operations on hierarchies: edge-addition, select, insert, and join. Based on this framework, we propose and establish the correctness of an out-of-core calculus for binary partition hierarchies and for minimum spanning trees. First applications to image processing suggest the practical efficiency of this calculus.
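For orientation, the structure this calculus computes, the binary partition hierarchy by altitude ordering, can be built in memory with a plain Kruskal-style pass (a baseline sketch under illustrative names, not the out-of-core calculus proposed in the chapter):

```python
def bph_by_altitude(num_vertices, edges):
    """Binary partition hierarchy by altitude ordering via Kruskal.
    edges: list of (weight, u, v) tuples. Returns (parent, altitude);
    the first num_vertices entries are leaves, the rest internal nodes
    created in order of increasing edge weight."""
    parent = list(range(num_vertices))      # grows as internal nodes appear
    altitude = [0] * num_vertices
    root = list(range(num_vertices))        # union-find over current forests

    def find(x):
        while root[x] != x:
            root[x] = root[root[x]]
            x = root[x]
        return x

    tree_of = list(range(num_vertices))     # representative -> its tree node
    for w, u, v in sorted(edges):
        ru, rv = find(u), find(v)
        if ru == rv:
            continue                        # edge inside one tree: skip
        node = len(parent)                  # new binary internal node
        parent.append(node)
        altitude.append(w)
        parent[tree_of[ru]] = node
        parent[tree_of[rv]] = node
        root[ru] = rv
        tree_of[rv] = node
    return parent, altitude
```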
Conference Paper
Full-text available
A new paradigm for large-scale, unsupervised object detection on remote sensing imagery is proposed, which relies on the synergy of hierarchical image representations and deep learning. The proposed paradigm: (a) reduces the search space to a set of candidate objects which conform to the geometric characteristics of the object of interest, hence dramatically decreases deployment time in comparison to brute-force approaches which scan the entire image; (b) discards the need for manual training data generation which is laborious, expensive and prone to user bias. An example application is presented where the max-tree and the VGG-16 convolutional neural network architecture are used for the detection of circular tanks on very high resolution satellite imagery.
Article
Full-text available
Max-trees, or component trees, are graph structures that represent the connected components of an image in a hierarchical way. Nowadays, many application fields rely on images with high dynamic range or floating-point values. Efficient sequential algorithms exist to build trees and compute attributes for images of any bit depth. However, we show that the current parallel algorithms already perform poorly with integers at bit depths higher than 16 bits per pixel. We propose a parallel method combining the two worlds of flooding and merging max-tree algorithms. First, a pilot max-tree of a quantized version of the image is built in parallel using a flooding method. This structure is then used in a parallel leaf-to-root approach to efficiently compute the final max-tree and to drive the merging of the sub-trees computed by the threads. We present an analysis of the performance on both simulated and actual 2D images and 3D volumes. Execution times are about 20 times better than the fastest sequential algorithm, and the speed-up goes up to 30-40× on 64 threads.
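The pilot-tree idea rests on a monotone quantization of the high-dynamic-range image: because the mapping is order-preserving, every component of the final max-tree is nested inside a component of the pilot tree. A minimal sketch (the paper's exact quantization scheme may differ):

```python
import numpy as np

def quantize_for_pilot(img, pilot_bits=8):
    """Map an image of arbitrary dynamic range onto 2**pilot_bits levels
    with a monotone (order-preserving) transform, suitable for building a
    coarse pilot max-tree that nests the components of the final tree."""
    lo, hi = float(img.min()), float(img.max())
    levels = (1 << pilot_bits) - 1
    if hi == lo:                      # flat image: a single level suffices
        return np.zeros(img.shape, dtype=np.uint32)
    q = np.floor((img - lo) / (hi - lo) * levels).astype(np.uint32)
    return np.minimum(q, levels)      # guard the exact-maximum pixels
```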
Article
Full-text available
Morphological attribute profiles are multilevel decompositions of images obtained with a sequence of transformations performed by connected operators. They have been extensively employed in performing multi-scale and region-based analysis in a large number of applications. One main, still unresolved, issue is the selection of filter parameters able to provide a representative and non-redundant threshold decomposition of the image. This paper presents a framework for the automatic selection of filter thresholds based on Granulometric Characteristic Functions (GCFs). GCFs describe the way that non-linear morphological filters simplify a scene according to a given measure. Since attribute filters rely on a hierarchical representation of an image (e.g., the Tree of Shapes) for their implementation, GCFs can be efficiently computed by taking advantage of the tree representation. Ultimately, the study of the GCFs allows the identification of a meaningful set of thresholds. Therefore, a trial-and-error approach is not necessary for the threshold selection, automating the process and in turn decreasing the computational time. It is shown that the redundant information is reduced within the resulting profiles (a frequent problem with manual selection). The proposed approach is tested on two real remote sensing data sets, and the classification results are compared with strategies present in the literature.
Conference Paper
Full-text available
Dramatic advances in DNA sequencing technology have made it possible to study microbial environments by direct sequencing of environmental DNA samples. Yet, due to the huge volume and high data complexity, current de novo assemblers cannot handle large metagenomic datasets or fail to perform assembly with acceptable quality. This paper presents the first parallel solution for decomposing the metagenomic assembly problem without compromising the post-assembly quality. We transform this problem into that of finding weakly connected components in the de Bruijn graph. We propose a novel distributed memory algorithm to identify the connected subgraphs, and present strategies to minimize the communication volume. We demonstrate the scalability of our algorithm on a soil metagenome dataset with 1.8 billion reads. Our approach achieves a runtime of 22 minutes using 1280 Intel Xeon cores for a 421 GB uncompressed FASTQ dataset. Moreover, our solution is generalizable to finding connected components in arbitrary undirected graphs.
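The sequential core that this paper distributes, finding weakly connected components, can be sketched with a union-find (illustrative only; the paper's contribution is the distributed-memory decomposition and communication-minimization strategy, not this kernel):

```python
def connected_components(num_nodes, edges):
    """Label the weakly connected components of an undirected graph
    with union-find and path halving. Returns one root label per node;
    two nodes share a label iff they are in the same component."""
    parent = list(range(num_nodes))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv                 # union the two components
    return [find(x) for x in range(num_nodes)]
```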
Article
Full-text available
Attribute filters allow enhancement and extraction of features without distorting their borders, and never introduce new image features. To date, setting the attribute-threshold parameters of attribute filters has had to be done manually. This research explores novel, simple, fast and automated methods of computing attribute-threshold parameters based on image segmentation, thresholding and data clustering techniques for medical image enhancement. A performance analysis of the different methods is carried out using various 3D medical images of different modalities. Though several techniques perform well on these images, the choice of technique appears to depend on the imaging mode.
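One of the simplest automated choices explored in this line of work is a histogram-based threshold on the attribute values, e.g. Otsu's criterion (a generic sketch; the paper evaluates several segmentation, thresholding and clustering techniques, not necessarily this exact one):

```python
import numpy as np

def otsu_threshold(values, bins=256):
    """Pick a threshold on a 1D array of attribute values by maximizing
    Otsu's between-class variance over a histogram of the values."""
    hist, edges = np.histogram(values, bins=bins)
    p = hist / hist.sum()
    centers = (edges[:-1] + edges[1:]) / 2
    w0 = np.cumsum(p)                 # weight of the low class at each cut
    m = np.cumsum(p * centers)        # cumulative mean
    m_total = m[-1]
    w1 = 1.0 - w0
    with np.errstate(divide="ignore", invalid="ignore"):
        var_between = (m_total * w0 - m) ** 2 / (w0 * w1)
    var_between[~np.isfinite(var_between)] = 0   # empty-class cuts score 0
    return centers[np.argmax(var_between)]
```

For a clearly bimodal attribute distribution, the returned threshold separates the two modes.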
Article
Full-text available
The topographic map of a gray-level image, also called tree of shapes, provides a high-level hierarchical representation of the image contents. This representation, invariant to contrast changes and to contrast inversion, has been proved very useful to achieve many image processing and pattern recognition tasks. Its definition relies on the total ordering of pixel values, so this representation does not exist for color images, or more generally, multivariate images. Common workarounds, such as marginal processing, or imposing a total order on data, are not satisfactory and yield many problems. This paper presents a method to build a tree-based representation of multivariate images, which features marginally the same properties of the gray-level tree of shapes. Briefly put, we do not impose an arbitrary ordering on values, but we only rely on the inclusion relationship between shapes in the image definition domain. The interest of having a contrast invariant and self-dual representation of multivariate image is illustrated through several applications (filtering, segmentation, and object recognition) on different types of data: color natural images, document images, satellite hyperspectral imaging, multimodal medical imaging, and videos.
Article
Full-text available
The availability of hyperspectral images with improved spectral and spatial resolutions provides the opportunity to obtain accurate land-cover classification. In this paper, a novel methodology that combines spectral and spatial information for supervised hyperspectral image classification is proposed. A feature reduction strategy based on independent component analysis is the main core of the spectral analysis, where the exploitation of prior information coupled to the evaluation of the reconstruction error assures the identification of the best class-informative subset of independent components. Reduced attribute profiles (APs), which are designed to address well-known issues related to information redundancy that affect the common morphological APs, are then employed for the modeling and fusion of the contextual information. Four real hyperspectral data sets, which are characterized by different spectral and spatial resolutions with a variety of scene typologies (urban, agriculture areas), have been used for assessing the accuracy and generalization capabilities of the proposed methodology. The obtained results demonstrate the classification effectiveness of the proposed approach in all different scene typologies, with respect to other state-of-the-art techniques.
Conference Paper
We present a new algorithm for attribute filtering of extremely large images, using a forest of modified max-trees, suitable for distributed memory parallel machines. First, max-trees of tiles of the image are computed, after which messages are exchanged to modify the topology of the trees and update attribute data, such that filtering the modified trees on each tile gives exactly the same results as filtering a regular max-tree of the entire image. On a cluster, a speed-up of up to 53× is obtained on 64, and up to 100× on 128 single-CPU nodes. On a shared memory machine a peak speed-up of 50× on 64 cores was obtained.
Article
The tree of shapes is a self-dual tree-based image representation belonging to the field of mathematical morphology. This representation is highly interesting since it is invariant to contrast changes and inversion, and allows for numerous and powerful applications. A new algorithm to compute the tree of shapes has been recently presented: it has a quasilinear complexity; it is the only known algorithm that is also effective for nD images with n > 2; yet it is sequential. With the increasing size of data to process, the need for a parallel algorithm to compute that tree is of prime importance; in this paper, we present such an algorithm. We also give some benchmarks that show that the parallel version is computationally effective. As a consequence, it becomes possible to process 3D images with some powerful self-dual morphological tools.