ABSTRACT: Multi-object tracking is a difficult problem, but in recent years, particle filter-based object trackers (also called Sequential Monte Carlo estimation-based trackers) have proven to be very effective. This method consists of a dynamic model for prediction and an observation model to evaluate the likelihood of a predicted state (Doucet et al. 2001). In other words, it recursively estimates the time-evolving posterior distribution of the target locations (and/or velocities) conditioned on all observations seen so far. In our experimental context, the study of bedload transport, we based our tracking algorithm on particle filters and detector confidence to track beads of two different classes and sizes. With this work, we aim to improve the precision of the bead positions and to handle false positives (wrong detections) and false negatives (missing detections).
GdR ISIS - Suivi multi-objets dans les séquences vidéo complexes, Télécom ParisTech; 10/2015
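The predict / weight / resample recursion described in the abstract above can be illustrated with a minimal bootstrap particle filter. This is only a sketch under stated assumptions, not the authors' tracker: the state is a single 1-D position, the dynamic model is a random walk, the observation likelihood is Gaussian, and the names `particle_filter_step`, `track`, `motion_std` and `obs_std` are hypothetical. A missing detection (a false negative) is represented by `None` and handled by running the prediction step alone.

```python
import math
import random


def particle_filter_step(particles, observation, rng, motion_std=1.0, obs_std=2.0):
    """One predict / weight / resample cycle for a 1-D position state."""
    # Predict: propagate each particle through a random-walk dynamic model (assumed).
    predicted = [p + rng.gauss(0.0, motion_std) for p in particles]

    if observation is None:
        # Missing detection: keep the predicted cloud and skip the update.
        return predicted, sum(predicted) / len(predicted)

    # Update: weight each particle by a Gaussian observation likelihood (assumed).
    weights = [math.exp(-0.5 * ((observation - p) / obs_std) ** 2) for p in predicted]
    total = sum(weights)
    if total == 0.0:
        # All likelihoods underflowed: fall back to uniform weights.
        weights = [1.0 / len(predicted)] * len(predicted)
    else:
        weights = [w / total for w in weights]

    # Estimate: posterior mean of the weighted particle set.
    estimate = sum(w * p for w, p in zip(weights, predicted))

    # Resample: draw an equally weighted set in proportion to the weights.
    resampled = rng.choices(predicted, weights=weights, k=len(predicted))
    return resampled, estimate


def track(observations, n_particles=500, seed=0):
    """Run the filter over a sequence of observations (None = missed detection)."""
    rng = random.Random(seed)
    particles = [rng.uniform(-10.0, 10.0) for _ in range(n_particles)]
    estimates = []
    for z in observations:
        particles, estimate = particle_filter_step(particles, z, rng)
        estimates.append(estimate)
    return estimates
```

Calling `track([3.0, 3.0, None, 3.0])` returns one position estimate per frame, including the frame with no detection. A real bead tracker would use a 2-D state with velocity and a likelihood built from the detector confidence mentioned in the abstract.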
ABSTRACT: Bedload transport is the sediment load transported in contact with the bed of a river channel. Empirical relations are often used to predict sediment transport rates. However, as rivers are highly variable (e.g. geometries, flow rates), these predictions usually lack accuracy. Consequently, a more detailed comprehension of the physical processes is required. More precisely, we require an improved understanding of how fine sediment inputs to river channels influence sediment transport, channel stability, ecology and stratigraphy. In this context, image processing is a promising technique providing a highly detailed insight into channel behavior over time. Moreover, it allows the development, and experimental validation, of new models. To do so, it is necessary to measure important physical quantities such as water depth, slope and their evolution with time. In this work, we propose a new image analysis chain dedicated to the experimental investigation of bedload transport in a laboratory flume.
First, we describe the experimental setup, which consists of a narrow (10.3 mm wide), glass-sided, steel-framed channel inclined at 10.1 %. The channel is equipped with a water supply and two sediment feeders which input bimodal mixtures of spherical glass beads under varying conditions, particularly using varying grain size ratios. Each experiment is recorded using a high-speed camera at 130 frames per second with an image resolution of 1024x500 pixels. Using this setup, the entrainment and transport of particles can be studied at the particle scale.
Then, we present the main contribution of this paper, an image processing method that is able to automatically detect the water free surface and the bed elevation for a sequence of images. This method is based on a combination of morphological operations such as erosion, dilation, closing or watershed.
Finally, an experimental evaluation is presented which demonstrates the ability of our setup and processing chain to track channel slope and water depth evolution over time and to make comparisons for varying conditions, thereby elucidating the fundamental principles controlling these processes.
10th Pacific Symposium on Flow Visualization and Image Processing 2015, Naples, Italy; 06/2015
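As an illustration of the morphological operations named in the abstract above (erosion, dilation, closing), here is a minimal pure-Python sketch on a binary mask. It is not the authors' processing chain: the 3x3 structuring element, the clipped border handling, and the helper `top_profile` (which reads a bed-elevation-like profile as the first foreground row in each column) are all assumptions made for the example.

```python
def _neighbourhood(img, r, c):
    """Values in the 3x3 window around (r, c), clipped at the image border."""
    h, w = len(img), len(img[0])
    return [img[rr][cc]
            for rr in range(max(0, r - 1), min(h, r + 2))
            for cc in range(max(0, c - 1), min(w, c + 2))]


def dilate(img):
    """Binary dilation: a pixel is set if any pixel in its window is set."""
    return [[int(any(_neighbourhood(img, r, c))) for c in range(len(img[0]))]
            for r in range(len(img))]


def erode(img):
    """Binary erosion: a pixel is kept only if its whole window is set."""
    return [[int(all(_neighbourhood(img, r, c))) for c in range(len(img[0]))]
            for r in range(len(img))]


def closing(img):
    """Closing = dilation followed by erosion; fills small gaps and pits."""
    return erode(dilate(img))


def top_profile(mask):
    """Row index of the first foreground pixel in each column (None if empty)."""
    profile = []
    for c in range(len(mask[0])):
        rows = [r for r in range(len(mask)) if mask[r][c]]
        profile.append(rows[0] if rows else None)
    return profile
```

On a mask whose top boundary has a one-pixel pit, `top_profile(closing(mask))` gives a flat profile, whereas `top_profile(mask)` follows the pit. The watershed step mentioned in the abstract is not covered by this sketch.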
ABSTRACT: Bedload, the part of sediment transport remaining in contact with the bed, has been mainly investigated from a fluid perspective. Bedload should also be considered from a granular point of view, taking into account the grain-grain interactions. This paper focuses on particle tracking velocimetry algorithms to better understand bedload transport at the particle scale. Two-size mixtures of spherical glass beads entrained by a shallow turbulent and supercritical water flow were analysed in a quasi-two-dimensional 10 % steep channel with a mobile bed. The coarse particle diameters were 6 or 5 mm and the finer diameters ranged from about 0.7 to 4 mm. Water flow and sediment rates were kept constant at the inlet. After obtaining bedload equilibrium for the coarser particles only, that is, neither bed degradation nor aggradation over sufficiently long time intervals, and a bed slope parallel to the flume slope, finer sediment was introduced. The evolution towards a new equilibrium state was recorded through video acquisition from the side by a high-speed camera. Particle tracking algorithms made it possible to determine the position, velocity and trajectory of a very large number of both coarse and sometimes fine particles over the depth of the bedload layer. This paper will present in detail the algorithms used for detecting and tracking the glass beads, before analysing results on particle velocity distributions and depth profiles as well as results on concentrations and sediment rates.
European Geosciences Union General Assembly 2015, Vienna, Austria; 04/2015
ABSTRACT: In the context of category-level scene classification, the bag-of-visual-words model (BoVW) is widely used for image representation. This model is appearance based and does not contain any information regarding the arrangement of the visual words in the 2D image space. To overcome this problem, recent approaches try to capture information about either the absolute or the relative spatial location of visual words. In the first category, the so-called Spatial Pyramid Representation (SPR) is very popular thanks to its simplicity and good results. Alternatively, adding information about occurrences of relative spatial configurations of visual words has proven to be effective, but at the cost of higher computational complexity, specifically when relative distances and angles are taken into account. In this paper, we introduce a novel way to incorporate both distance and angle information in the BoVW representation. The novelty is, first, to provide a computationally efficient representation adding relative spatial information between visual words and, second, to use a soft pairwise voting scheme based on the distance in the descriptor space. Experiments on the challenging data sets MSRC-2, 15Scene, Caltech101, Caltech256 and Pascal VOC 2007 demonstrate that our method outperforms or is competitive with existing methods. We also show that it provides important complementary information to the spatial pyramid matching and can improve the overall performance.
ABSTRACT: In multimedia information retrieval, combining different modalities (text, image, audio or video) provides additional information and generally improves the overall system performance. For this purpose, the linear combination method is presented as simple, flexible and effective. However, it requires choosing the weight assigned to each modality. This issue is still an open problem and is addressed in this paper. Our approach, based on Fisher Linear Discriminant Analysis, aims to learn these weights for multimedia documents composed of text and images. Text and images are both represented with the classical bag-of-words model. Our method was tested on the ImageCLEF 2008 and 2009 datasets. Results demonstrate that our combination approach not only outperforms the use of the textual modality alone but also provides a nearly optimal learning of the weights with an efficient computation. Moreover, the method can combine more than two modalities without increasing the complexity, and thus the computing time.
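To make the idea of learning linear-combination weights with Fisher discriminant analysis concrete, the sketch below applies the textbook two-class Fisher criterion, w ∝ S_w⁻¹(μ_relevant − μ_non-relevant), to pairs of (text score, image score). The abstract does not give the paper's exact formulation, so this is an assumption; the function names and the toy scores in the usage note are hypothetical.

```python
def fisher_weights(relevant, non_relevant):
    """Two-class Fisher LDA direction for (text_score, image_score) pairs.

    Returns (w_text, w_image); only the direction matters for ranking.
    """
    def mean(rows):
        n = len(rows)
        return sum(r[0] for r in rows) / n, sum(r[1] for r in rows) / n

    def scatter(rows, mu):
        sxx = sum((r[0] - mu[0]) ** 2 for r in rows)
        sxy = sum((r[0] - mu[0]) * (r[1] - mu[1]) for r in rows)
        syy = sum((r[1] - mu[1]) ** 2 for r in rows)
        return sxx, sxy, syy

    mu_r, mu_n = mean(relevant), mean(non_relevant)
    # Within-class scatter matrix S_w = S_relevant + S_non_relevant.
    sxx, sxy, syy = (a + b for a, b in zip(scatter(relevant, mu_r),
                                           scatter(non_relevant, mu_n)))
    det = sxx * syy - sxy * sxy
    if abs(det) < 1e-12:
        raise ValueError("within-class scatter matrix is singular")

    # w = S_w^{-1} (mu_r - mu_n), using the closed-form 2x2 inverse.
    dx, dy = mu_r[0] - mu_n[0], mu_r[1] - mu_n[1]
    w_text = (syy * dx - sxy * dy) / det
    w_image = (sxx * dy - sxy * dx) / det
    return w_text, w_image


def combined_score(weights, text_score, image_score):
    """Linear combination of the per-modality scores."""
    return weights[0] * text_score + weights[1] * image_score
```

For example, with `relevant = [(0.9, 0.6), (0.8, 0.4), (0.7, 0.5)]` and `non_relevant = [(0.2, 0.5), (0.3, 0.3), (0.1, 0.4)]`, the text modality separates the classes far better than the image modality, and the learned text weight is correspondingly larger.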
ABSTRACT: Color description is a challenging task because of large variations in RGB values which occur due to scene accidental events, such as shadows, shading, specularities, illuminant color changes, and changes in viewing geometry. Traditionally, this challenge has been addressed by capturing the variations in physics-based models, and deriving invariants for the undesired variations. The drawback of this approach is that sets of distinguishable colors in the original color space are mapped to the same value in the photometric invariant space. This results in a drop of discriminative power of the color description. In this paper we take an information theoretic approach to color description. We cluster color values together based on their discriminative power in a classification problem. The clustering has the explicit objective of minimizing the drop in mutual information of the final representation. We show that such a color description automatically learns a certain degree of photometric invariance. We also show that a universal color representation, which is based on data sets other than the one at hand, can obtain competitive performance. Experiments show that the proposed descriptor outperforms existing photometric invariants. Furthermore, we show that combined with shape description these color descriptors obtain excellent results on four challenging datasets, namely, PASCAL VOC 2007, Flowers-102, Stanford dogs-120 and Birds-200.
Computer Vision and Pattern Recognition (CVPR); 06/2013
ABSTRACT: This paper presents a novel approach to incorporate spatial information in the bag-of-visual-words model for category-level and scene classification. In the traditional bag-of-visual-words model, feature vectors are histograms of visual words. This representation is appearance based and does not contain any information regarding the arrangement of the visual words in the 2D image space. In this framework, we present a simple and efficient way to infuse spatial information. In particular, we are interested in explicit global relationships among the spatial positions of visual words. Therefore, we take advantage of the orientation of the segments formed by Pairs of Identical visual Words (PIW). An evenly distributed normalized histogram of angles of PIW is computed. Histograms produced by each word type constitute a powerful description of intra-type visual word relationships. Experiments on challenging datasets demonstrate that our method is competitive with existing ones. We also show that our method provides important complementary information to the spatial pyramid matching and can improve the overall performance.
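The PIW idea above (a normalized histogram of the orientations of segments joining pairs of identical visual words) can be sketched as follows. The details are assumptions rather than the paper's specification: segments are treated as undirected, so angles are folded into [0, π), the bins are equal-width, and the name `piw_angle_histograms` and the `(x, y, word_id)` input layout are hypothetical.

```python
import math
from collections import defaultdict
from itertools import combinations


def piw_angle_histograms(keypoints, n_bins=9):
    """Per-word normalized histograms of segment orientations.

    keypoints: iterable of (x, y, word_id) tuples.
    Returns {word_id: [bin_0, ..., bin_{n_bins-1}]} for words with >= 1 pair.
    """
    positions = defaultdict(list)
    for x, y, word in keypoints:
        positions[word].append((x, y))

    histograms = {}
    for word, pts in positions.items():
        hist = [0.0] * n_bins
        for (x1, y1), (x2, y2) in combinations(pts, 2):
            if (x1, y1) == (x2, y2):
                continue  # coincident points define no orientation
            # Orientation of the undirected segment, folded into [0, pi).
            theta = math.atan2(y2 - y1, x2 - x1) % math.pi
            index = min(int(theta / math.pi * n_bins), n_bins - 1)
            hist[index] += 1.0
        total = sum(hist)
        if total > 0:
            histograms[word] = [h / total for h in hist]
    return histograms
```

The quadratic cost in the number of occurrences of each word is what the pairwise construction implies; concatenating the per-word histograms would give one possible image-level descriptor.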
ABSTRACT: With the growth of digital technology, very large quantities of documents composed of text and images are exchanged, which calls for models that can exploit this multimedia information effectively. In the context of information retrieval, one possible model represents textual and visual information separately and linearly combines the scores obtained from each representation. This approach requires setting weights to balance the contribution of each modality. The aim of this article is to present a new method for learning these weights, based on Fisher linear discriminant analysis (LDA). Experiments carried out on the ImageCLEF collection show that learning the weights with LDA is relevant and that the corresponding combination of scores significantly improves the results compared with using a single modality.
ABSTRACT: Bedload sediment transport of two-size coarse spherical particle mixtures in a turbulent supercritical flow was analyzed with image and particle tracking velocimetry algorithms in a two-dimensional flume. The image processing procedure is presented in full. Experimental results, including the size, the position, the trajectory, the state of movement (rest, rolling, and saltation), and the neighborhood configuration of each bead, were compared with a previous one-size experiment. Analysis of the solid discharge along the vertical displayed only one peak of rolling in the two-size bed, whereas three peaks of rolling appeared in the one-size case due to a larger collective motion. The same contrast is evidenced in spatio-temporal diagrams where the two-size mixtures are characterized by the predominance of saltation and a smaller number of transitions between rest and rolling. The segregation of fine particles in a bed formed by larger particles was analyzed taking into account the neighborhood configurations.
Experiments in Fluids 11/2010; 49(5). DOI:10.1007/s00348-010-0856-6 · 1.67 Impact Factor
ABSTRACT: This paper focuses on non-linear pattern matching transforms based on mathematical morphology for gray level image processing. Our contribution is on two fronts. First, we unify the existing and a priori unconnected approaches to this problem by establishing their theoretical links with topology. Setting them within the same context makes it possible to highlight their differences and similarities, and to derive new variants. Second, we develop the concept of virtual double-sided image probing (VDIP), a broad framework for non-linear pattern matching in grayscale images. VDIP extends our work on the multiple object matching using probing (MOMP) transform, which we previously defined to locate multiple grayscale patterns simultaneously. We show that available methods as well as the topological approach can be generalized within the VDIP framework. They can be formulated as particular variants of a general transform designed for virtual probing. Furthermore, a morphological metric, called SVDIP (single VDIP), is deduced from the VDIP concept. Some results are presented and compared with those obtained with classical methods.
ABSTRACT: Image representation using the bag-of-visual-words approach is commonly used in image classification. Features are extracted from images and clustered into a visual vocabulary. Images can then be represented as a normalized histogram of visual words, similarly to textual documents represented as a weighted vector of terms. As a result, text categorization techniques are applicable to image classification. In this paper, our contribution is twofold. First, we propose a suitable Term-Frequency and Inverse Document Frequency weighting scheme to characterize the importance of visual words. Second, we present a method to fuse different bags-of-words obtained with different vocabularies. We show that using our tf.idf normalization and the fusion leads to better classification rates than other normalization methods, other fusion schemes or other approaches evaluated on the SIMPLIcity collection.
Content-Based Multimedia Indexing (CBMI), 2010 International Workshop on; 07/2010
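The tf.idf weighting of visual words mentioned in the abstract above can be illustrated with the classic text-retrieval formulation, tf = count / image length and idf = log(N / document frequency), followed by L2 normalization. The specific weighting scheme proposed in the paper is not given in the abstract, so this is a sketch of the standard variant only; the name `tfidf_vectors` and the integer word ids are assumptions.

```python
import math
from collections import Counter


def tfidf_vectors(images, vocabulary_size):
    """Turn lists of visual-word ids into L2-normalized tf.idf vectors.

    images: list of lists of integer word ids in range(vocabulary_size).
    """
    n_images = len(images)

    # Document frequency: number of images in which each word occurs.
    doc_freq = Counter()
    for words in images:
        doc_freq.update(set(words))

    vectors = []
    for words in images:
        vec = [0.0] * vocabulary_size
        for word, count in Counter(words).items():
            tf = count / len(words)
            idf = math.log(n_images / doc_freq[word])
            vec[word] = tf * idf
        norm = math.sqrt(sum(v * v for v in vec))
        if norm > 0:
            vec = [v / norm for v in vec]
        vectors.append(vec)
    return vectors
```

A visual word that occurs in every image receives an idf of zero and therefore no weight, which is the intended effect: ubiquitous words carry no discriminative information.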
ABSTRACT: Aluminum sheet is currently used for body panels on a number of mass-produced vehicles, in particular for closure panels. AA5xxx alloys always contain coarse inter-metallic particles (Al(x)(Fe,Mn)(y)Si, Mg(2)Si) after casting. In the present work, inter-metallic particle break-up during hot reversible rolling of AA5182 alloy sheets has been analyzed. The sizes and shapes of inter-metallic particles in as-cast and industrially hot-rolled AA5182 alloy sheets were characterized by 3D X-ray tomography observations. The relation between particle break-up and particle morphology was then analyzed statistically and by a micromechanical finite element (FE)-based model. The essential outcomes of the statistical approach may be summarized as follows. First, the inter-metallic particle population may be described by five morphological parameters. Second, the comparison of the particle morphology in as-cast and industrially rolled sheets leads to the definition of five classes. The evolution of each particle class as a function of the rolling strain is provided. The statistical analysis shows which particles break up. The stresses and strains in inter-metallic particles, embedded in an elasto-viscoplastic aluminum matrix subjected to plane strain compression, were analyzed by an FE model. A new failure criterion was proposed. The essential outcomes of the mechanical approach are as follows: a precise description of stress concentration mechanisms in nonconvex particles, a close description of the parameters controlling particle break-up, and finally a simplified classification of the failure behavior.
ABSTRACT: As an alternative to vector representations, a recent trend in image classification suggests integrating additional structural information in the description of images in order to enhance classification accuracy. Rather than being represented in a p-dimensional space, images can typically be encoded in the form of strings, trees or graphs and are usually compared either by computing suitable metrics such as the (string or tree) edit distance, or by testing subgraph isomorphism. In this paper, we propose a new way of representing images in the form of strings whose symbols are weighted according to a TF-IDF-based weighting scheme, inspired by information retrieval. To be able to handle such real-valued weights, we first introduce a new weighted string edit distance that keeps the properties of a distance. In particular, we prove that the triangle inequality is preserved, which allows the computation of the edit distance in quadratic time by dynamic programming. We show on an image classification task that our new weighted edit distance not only significantly outperforms the standard edit distance but also seems very competitive in comparison with standard histogram distance-based approaches.
ABSTRACT: In-line digital holography is a promising technique for quantitative flow visualization. In particular, it makes it possible to measure and locate small objects in 3D from the acquisition of a single image, with a very simple experimental setup. The numerical processing of the corresponding hologram images is currently a very active field of study. The purpose of this communication is to propose an improvement to a processing algorithm proposed by our team. This algorithm notably increases the size of the field and the precision of the estimated particle parameters, but it has the drawback of a high computation time. After recalling the principle of the initial algorithm, we present modifications and evaluate their impact on performance using synthetic holograms. This study shows that the new version of the algorithm reduces the computation time by a factor of 2.5 to 4.9.
ABSTRACT: This paper reports our multimedia information retrieval experiments carried out for the ImageCLEF track 2009. In 2008, we proposed a multimedia document model defined as a vector of textual and visual terms weighted using a tf.idf approach. For our second participation, our goal was to improve this previous model in the following ways: 1) use of additional information for the textual part (legend and image bounding text extracted from the original documents), 2) use of different image detectors and descriptors, 3) a new text / image combination approach. The results make it possible to evaluate the benefits of these different improvements.
ABSTRACT: In this article, we present a multimedia document representation model that combines textual information and visual descriptors. The text and the image making up a document are each described by a vector of $tf.idf$ weights, following a "bag-of-words" approach. The model supports multimedia queries for information retrieval. Our method is evaluated on the ImageCLEF'08 collection, for which we have the ground truth. Several experiments were conducted with different descriptors and several combinations of modalities. Analysis of the results shows that a multimedia document model improves on the performance of a retrieval system based on a single modality, whether textual or visual.