Conference Paper

Fuzzy NLG System for Extensive Verbal Description of Relative Positions


... This tuple is the descriptor. Ancillary information is then deduced from the descriptor and links to a particular relation using a template (Francis et al., 2018). The system can return "A is completely surrounded by B" or "A is somewhat surrounded by B". ...
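The template step mentioned in this snippet can be illustrated with a minimal Python sketch. It assumes the descriptor has already been reduced to a single degree of surroundedness in [0, 1]. The function name, the hedge vocabulary beyond "completely" and "somewhat", and all thresholds are hypothetical choices made for illustration, not values taken from the cited work.

```python
def describe_surroundedness(degree: float, a: str = "A", b: str = "B") -> str:
    """Turn a degree of surroundedness in [0, 1] into a templated sentence."""
    if not 0.0 <= degree <= 1.0:
        raise ValueError("degree must lie in [0, 1]")
    # Hypothetical hedge thresholds, chosen only for illustration.
    hedges = [(0.9, "completely"), (0.6, "mostly"), (0.3, "somewhat")]
    for threshold, hedge in hedges:
        if degree >= threshold:
            return f"{a} is {hedge} surrounded by {b}"
    return f"{a} is not surrounded by {b}"


print(describe_surroundedness(0.95))
print(describe_surroundedness(0.40))
```

Keeping the hedges in a sorted list of (threshold, word) pairs makes the vocabulary easy to tailor without touching the template itself.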
... However, it does not compute any fuzzy landscape, so a new descriptor has to be generated for each pair of objects. It can interpret other relations, such as RCC8 relations or directional relations (Francis et al., 2018). Another method proposed by (Clément et al., 2017) can be used for assessing surroundedness. ...
Thesis
With the recent successes of deep learning and the growing interactions between humans and AIs, explainability issues have arisen. Indeed, it is difficult to understand the behaviour of deep neural networks, and thus such opaque models are not suited for high-stakes applications. In this thesis, we propose an approach for performing classification or annotation and providing explanations. It is based on a transparent model, whose reasoning is clear, and on interpretable fuzzy relations that make it possible to express the vagueness of natural language. Instead of learning on training instances that are annotated with relations, we propose to rely on a set of relations that was set beforehand. We present two heuristics that make the process of evaluating relations faster. Then, the most relevant relations can be extracted using a new fuzzy frequent itemset mining algorithm. These relations make it possible to build rules, for classification, and constraints, for annotation. Since the strengths of our approach are the transparency of the model and the interpretability of the relations, an explanation in natural language can be generated. We present experiments on images and time series that show the genericity of the approach. In particular, the application to explainable organ annotation was received positively by a set of participants who judged the explanations consistent and convincing.
Article
Urban land use extraction from Very High Resolution (VHR) remote sensing images is important in many applications. This study explores a novel way to characterize the spatial arrangement of land cover features, and to integrate it with commonly used land use indicators. Characterization is done based upon building objects, taking their functional properties into account. We categorize the objects to a set of building types according to their geometrical, morphological, and contextual attributes. The spatial arrangement is characterized by quantifying the distribution of building types within a land use unit. Moreover, a set of existing land use indicators primarily based upon the coverage ratio and density of land cover features is investigated. A Bayesian network integrates the spatial arrangement and land use indicators, by which the urban land use is inferred. We applied urban land use extraction to a Pléiades VHR image over the city of Wuhan, China. Our results showed that integrating the spatial arrangement significantly improved the accuracy of urban land use extraction as compared with using land use indicators alone. Moreover, the Bayesian network method produced results comparable to other commonly used classifiers. We concluded that the proposed characterization of spatial arrangement and Bayesian network integration was effective for urban land use extraction from VHR images.
Article
Automatic extraction of building roofs from remote sensing data is important for many applications, including 3D city modeling. This paper proposes a new method for automatic segmentation of raw LIDAR (light detection and ranging) data. Using the ground height from a DEM (digital elevation model), the raw LIDAR points are separated into two groups. The first group contains the ground points that form a "building mask". The second group contains non-ground points that are clustered using the building mask. A cluster of points usually represents an individual building or tree. During segmentation, the planar roof segments are extracted from each cluster of points and refined using rules, such as the coplanarity of points and their locality. Planes on trees are removed using information such as area and point height difference. Experimental results on nine areas of six different data sets show that the proposed method can successfully remove vegetation and therefore offers a high success rate for building detection (about 90% correctness and completeness) and roof plane extraction (about 80% correctness and completeness), even when LIDAR point density is as low as four points/m². Thus, the proposed method can be exploited in various applications.
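The first step of this abstract, splitting raw points into ground and non-ground groups using the DEM ground height, can be sketched as follows. This is an illustrative sketch only: the function name, the callable DEM lookup, and the 1.0 m tolerance are assumptions, and the abstract does not state how the separation threshold is actually chosen.

```python
from typing import Callable, List, Sequence, Tuple

Point3D = Tuple[float, float, float]


def split_by_ground_height(
    points: Sequence[Point3D],
    ground_height: Callable[[float, float], float],
    tolerance: float = 1.0,  # hypothetical threshold in metres
) -> Tuple[List[Point3D], List[Point3D]]:
    """Separate points into (ground, non_ground) by height above the DEM."""
    ground: List[Point3D] = []
    non_ground: List[Point3D] = []
    for x, y, z in points:
        # A point counts as ground if it lies within `tolerance` of the DEM surface.
        if z - ground_height(x, y) <= tolerance:
            ground.append((x, y, z))
        else:
            non_ground.append((x, y, z))
    return ground, non_ground


# Usage with a flat DEM at 100 m (an assumed toy terrain).
flat_dem = lambda x, y: 100.0
cloud = [(0.0, 0.0, 100.2), (1.0, 0.0, 108.5), (2.0, 1.0, 100.9)]
ground_pts, other_pts = split_by_ground_height(cloud, flat_dem)
```

The non-ground group would then be passed to the clustering and roof-plane extraction stages, which are not sketched here.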
Conference Paper
We propose a system for human-robot interaction that learns both models for spatial prepositions and for object recognition. Our system grounds the meaning of an input sentence in terms of visual percepts coming from the robot's sensors in order to send an appropriate command to the PR2 or respond to spatial queries. To perform this grounding, the system recognizes the objects in the scene, determines which spatial relations hold between those objects, and semantically parses the input sentence. The proposed system uses the visual and spatial information in conjunction with the semantic parse to interpret statements that refer to objects (nouns), their spatial relationships (prepositions), and to execute commands (actions). The semantic parse is inherently compositional, allowing the robot to understand complex commands that refer to multiple objects and relations such as: “Move the cup close to the robot to the area in front of the plate and behind the tea box”. Our system correctly parses 94% of the 210 online test sentences, correctly interprets 91% of the correctly parsed sentences, and correctly executes 89% of the correctly interpreted sentences.
Article
A novel shape descriptor, named as ratio histograms (R-histogram), is proposed to represent the relative attitude relationship between two independent shapes. For a pair of two shapes, the shapes are treated as the longitudinal segments parallel to the line connecting centroids of the two shapes, and the R-histogram is composed of the length ratios of collinear longitudinal segments. R-histogram is theoretically affine invariant due to collinear distance invariance of the affine transformation. In addition, as the computation of the length ratio weakens the noise contribution, R-histogram is robust to noise. Based on the R-histogram, the shape-matching algorithm includes two major phases: preprocessing and matching. The first phase, which can be processed off-line, is trying to obtain the R-histograms of all original shape pairs. In the second phase, for each transformed shape pair, its R-histogram is computed and the candidate matched shape pair with minimal R-histogram matching error is found. Subsequently, a voting strategy, which further improves the accuracy of shape matching, is adopted for the candidate corresponding shape pairs. Experimental results demonstrate that the proposed R-histogram is robust and efficient.
Conference Paper
This paper presents a novel approach, called the Spread Histogram, for calculating spatial relations between objects. It makes it possible to determine relations such as INSIDE, OUTSIDE, and ENCOMPASS. Additionally, the method cooperates very well with standard histogram methods, such as the Histogram of Angles, for determining directional spatial relations.
Conference Paper
We describe an interval logic for reasoning about space. The logic simplifies an earlier theory developed by Randell and Cohn, and that of Clarke upon which the former was based. The theory supports a simpler ontology, has fewer defined functions and relations, yet does not suffer in terms of its useful expressiveness. An axiomatisation of the new theory and a comparison with the two original theories is given. 1 Introduction. The use of interval logics for the representation of time...
Conference Paper
In this paper, we present a novel unifying concept of pairwise spatial relations. We develop two-way directional relations with respect to a unique point set, based on the topology of the studied objects, which avoids problems related to erroneous choices of reference objects while preserving symmetry. The method is robust to any type of image configuration since the directional relations are topologically guided. An automatic prototype for graphical symbol retrieval is presented in order to establish its expressiveness.
Article
In conversation, people often use spatial relationships to describe their environment, e.g., "There is a desk in front of me and a doorway behind it," and to issue directives, e.g., "go around the desk and through the doorway." In our research, we have been investigating the use of spatial relationships to establish a natural communication mechanism between people and robots, in particular, for novice users. In this paper, the work on robot spatial relationships is combined with a multimodal robot interface. We show how linguistic spatial descriptions and other spatial information can be extracted from an evidence grid map and how this information can be used in a natural, human-robot dialog. Examples using spatial language are included for both robot-to-human feedback and also human-to-robot commands. We also discuss some linguistic consequences in the semantic representations of spatial and locative information based on this work.
Article
Searching for relevant knowledge across heterogeneous geospatial databases requires an extensive knowledge of the semantic meaning of images, a keen eye for visual patterns, and efficient strategies for collecting and analyzing data with minimal human intervention. In this paper, we present our recently developed content-based multimodal Geospatial Information Retrieval and Indexing System (GeoIRIS) which includes automatic feature extraction, visual content mining from large-scale image databases, and high-dimensional database indexing for fast retrieval. Using these underpinnings, we have developed techniques for complex queries that merge information from heterogeneous geospatial databases, retrievals of objects based on shape and visual characteristics, analysis of multiobject relationships for the retrieval of objects in specific spatial configurations, and semantic models to link low-level image features with high-level visual descriptors. GeoIRIS brings this diverse set of technologies together into a coherent system with an aim of allowing image analysts to more rapidly identify relevant imagery. GeoIRIS is able to answer analysts' questions in seconds, such as "given a query image, show me database satellite images that have similar objects and spatial relationship that are within a certain radius of a landmark."
Article
In this paper, we introduce a system able to generate an intuitive, human-like linguistic description of the spatial relations between two objects in the observer's reference frame. The description includes distance relation terms, directional relation terms, topological relation terms, and approximate terms commonly found in daily communication. Each description is built from fuzzy rules and information extracted from V-Histograms. Excellent results show that the method of linguistic description is feasible, with the advantages of acknowledgement and higher accuracy, and can be applied in vehicle driver assistance, robot vision, etc.
Article
We have described a system for reasoning about temporal intervals that is both expressive and computationally effective. The representation captures the temporal hierarchy implicit in many domains by using a hierarchy of reference intervals, which precisely control the amount of deduction performed automatically by the system. This approach is particularly useful in domains where temporal information is imprecise and relative, and techniques such as dating are not possible. © 1990 Morgan Kaufmann Publishers, Inc. Published by Elsevier Inc. All rights reserved.
Article
Spatial prepositions, like above, inside, near, denote spatial relationships. A relative position descriptor is a basis from which quantitative models of spatial relationships can be derived. It is an image descriptor, like colour, texture, and shape descriptors. Various relative position descriptors can be found in the literature. In this paper, we introduce a new relative position descriptor, the Φ-descriptor, that has about all the strengths of each and every one of its competitors, and none of the weaknesses. Our approach is based on the concept of the F-histogram and on an original categorization of pairs of consecutive boundary points on a line.
Article
In this paper, we present an approach for modeling and comparing small sets of 2-D objects based on their spatial relationships. This situation can arise in the conflation of a hand- or machine-drafted map to a satellite image, or in the correspondence problem of matching two images taken under different viewing conditions. We focus on the specific problem of matching a sketched map containing several 2-D objects to hand-segmented satellite imagery. We define a similarity measure between the spatial configurations of two object sets, which uses attributed relational graphs to represent scene information. Objects are represented as graph nodes and edges are defined by the histograms of forces between object pairs. We develop a memetic algorithm based on a (μ+λ) evolution strategy to solve this scene-matching problem with three domain-specific local search operators that are compared experimentally.
Article
This paper describes research that seeks to supersede human inductive learning and reasoning in high-level scene understanding and content extraction. Searching for relevant knowledge with a semantic meaning consists mostly in visual human inspection of the data, regardless of the application. The method presented in this paper is an innovation in the field of information retrieval. It aims to discover latent semantic classes containing pairs of objects characterized by a certain spatial positioning. A hierarchical structure is recommended for the image content. This approach is based on a method initially developed for topics discovery in text, applied this time to invariant descriptors of image region or objects configurations. First, invariant spatial signatures are computed for pairs of objects, based on a measure of their interaction, as attributes for describing spatial arrangements inside the scene. Spatial visual words are then defined through a simple classification, extracting new patterns of similar object configurations. Further, the scene is modeled according to these new patterns (spatial visual words) using the latent Dirichlet allocation model into a finite mixture over an underlying set of topics. In the end, some statistics are done to achieve a better understanding of the spatial distributions inside the discovered semantic classes.
Article
The importance of topological and directional relationships between spatial objects has been stressed in different fields, notably in Geographic Information Systems (GIS). In an earlier work, we introduced the notion of the F-histogram, a generic quantitative representation of the relative position between two 2D objects, and showed that it can be of great use in understanding the spatial organization of regions in images. Here, we illustrate that the F-histogram constitutes a valuable tool for extracting directional and topological relationship information. The considered objects are not necessarily convex and their geometry is not approximated through, e.g., Minimum Bounding Rectangles (MBRs). The F-histograms introduced in this chapter are coupled with Allen’s temporal relationships based on fuzzy set theory. Allen’s relationships are commonly extended into the spatial domain for GIS purposes, and fuzzy set theoretic approaches are widely used to handle imprecision and achieve robustness in spatial analysis. For any direction in the plane, the F-histograms define a fuzzy 13-partition of the set of all object pairs, and each class of the partition corresponds to an Allen relation. Lots of directional and topological relationship information as well as different levels of refinements can be easily obtained from this approach, in a computationally tractable way. Keywords. F-histograms, Allen relations, spatial relations, spatial analysis, Geographic
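For readers unfamiliar with the 13 Allen relations referred to above, the following Python sketch classifies a pair of crisp 1-D intervals. It is only an illustration of the crisp case: the fuzzy 13-partition obtained from F-histograms is not reproduced here, and the function name and the tuple representation of intervals are assumptions.

```python
from typing import Tuple

Interval = Tuple[float, float]


def allen_relation(a: Interval, b: Interval) -> str:
    """Return the Allen relation that holds between intervals a and b."""
    a1, a2 = a
    b1, b2 = b
    if a1 >= a2 or b1 >= b2:
        raise ValueError("intervals must satisfy start < end")
    # Disjoint or touching intervals.
    if a2 < b1:
        return "before"
    if a2 == b1:
        return "meets"
    if b2 < a1:
        return "after"
    if b2 == a1:
        return "met-by"
    # From here on the two interiors intersect.
    if a1 == b1 and a2 == b2:
        return "equals"
    if a1 == b1:
        return "starts" if a2 < b2 else "started-by"
    if a2 == b2:
        return "finishes" if a1 > b1 else "finished-by"
    if b1 < a1 and a2 < b2:
        return "during"
    if a1 < b1 and b2 < a2:
        return "contains"
    return "overlaps" if a1 < b1 else "overlapped-by"


print(allen_relation((0, 2), (1, 3)))
```

Applied to the longitudinal sections of two objects along a given direction, such a classification is the crisp counterpart of what the abstract describes in fuzzy form.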
Article
This paper presents semantic risk estimation of suspected minefields using spatial relationships of minefield indicators extracted from multi-level remote sensing. Both satellite images and pyramidal airborne acquisitions, from flying heights of 900m down to 30m and with resolutions from 1m to 2cm, are used for identification of minefield indicators. The R-Histogram [1] is a quantitative representation of the spatial relationship between two objects in an image. Eight spatial relationships can be generated: 1) LEFT OF, 2) RIGHT OF, 3) ABOVE, 4) BELOW, 5) NEAR, 6) FAR, 7) INSIDE, 8) OUTSIDE. R-Histogram semantics are first generated from selected indicators, and metrics such as topological proximity and directional relationships are trained for soft classification of a risk index (normalized to 0-1). We present a framework showing how semantic metadata generated from remote sensing images are used in risk estimation. The resultant risk index placed seven out of twelve mine accidents in high-risk regions. More importantly, comparison with ground truth obtained after mine clearance shows that three out of the four identified pattern minefields fall into the area estimated at very high risk. A parcel-based per-field risk estimation can also be easily generated to show the usefulness of the risk index.
Article
In this paper, we show how linguistic expressions can be generated to describe the spatial relations between a mobile robot and its environment, using readings from a ring of sonar sensors. Our work is motivated by the study of human-robot communication for novice robot users. The ultimate goal is to exploit these linguistic expressions for navigation of the mobile robot in an unknown environment, where the expressions represent the qualitative state of the robot in terms that are easily understood by humans. The notion of the histogram of forces was presented in previous work and used to generate linguistic descriptions of relative positions in digital images. Here, we demonstrate that it also permits fast processing of vector data and can be applied to a robot with range sensors moving in a dynamic environment. We introduce a new method for detecting partially and completely surrounded conditions, and we show that detailed descriptions can be obtained as well as coarse ones. Numerous examples are included, illustrating a variety of situations.
Article
In this paper, we introduce a system able to generate an intuitive, human-like linguistic description of the topological relationships between two objects. The description includes approximate terms commonly found in daily communications. It attempts to capture the essential characteristics of the relationships, while leaving out superfluous and possibly overwhelming detail. The objects are 2-D image objects. They need not be convex, nor connected, and they may have holes in them. Each description is built around Allen relations, based on information extracted from F-histograms. It consists of a topological component that indicates the primary topological relationships, directional estimates of where these relationships are most prominent, and a self-assessment component which reflects the complexity of the situation. Experiments on synthetic and real data validate the approach.
Article
This paper presents a system based on neural networks that can analyse spatial relations in a visual scene and connect them to appropriate linguistic descriptions. The system learns spatial concepts like "right of" and "above" by viewing a visual scene containing a number of objects and simultaneously receiving a text string describing the scene. The spatial relations between the objects in the scene are analysed with the aid of saccadic shifts of the focus of attention. The system thus learns to correlate linguistic expressions for spatial relations with different kinds of saccades. After being trained, the system can correctly describe previously viewed scenes.
Article
Object recognition and scene analysis tasks can be greatly enhanced when information about spatial organization in an image is available. Moreover, for recognition of complex objects a suitable representation of spatial relations between objects' components taking into account shape, size, orientation, etc., is required. This cannot be accomplished by reducing a region to one or a few representative points; instead the region as a whole must be treated. This paper presents a fuzzy logic approach to the representation and recognition of spatial relations between regions in a 2D image. The main source of information on spatial relations is the geometry of the regions in question and we argue that this is complex enough to cause ambiguity in spatial relations, and hence to warrant a fuzzy logic approach. The basic idea is to calculate the angles between the line connecting two points (one in each region) and the horizontal line, to construct a histogram of these angles, and then upon an interpretation of the histogram as a fuzzy set to match it with the fuzzy sets representing a vocabulary of spatial relations. Other expressions of the spatial information which may be context dependent can be easily obtained by adding context knowledge. Several examples are used to illustrate our approach.
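The basic idea in this abstract, a histogram of angles matched against a fuzzy set for a relation, can be sketched in Python as below. Several choices are assumptions made for illustration and are not taken from the paper: regions are given as lists of (x, y) points with y increasing upward, the membership function for "right of" is cos² on (-π/2, π/2) and zero elsewhere, and the histogram is matched by a membership-weighted sum. The function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def angle_histogram(region_a: Sequence[Point], region_b: Sequence[Point],
                    bins: int = 36) -> List[float]:
    """Normalized histogram of the angles of vectors from points of B to points of A."""
    counts = [0] * bins
    width = 2 * math.pi / bins
    for bx, by in region_b:
        for ax, ay in region_a:
            if (ax, ay) == (bx, by):
                continue  # the angle is undefined for coincident points
            theta = math.atan2(ay - by, ax - bx)
            # Bins are centred on multiples of `width`, so bin 0 is centred on angle 0.
            counts[round(theta / width) % bins] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * bins


def degree_right_of(hist: Sequence[float]) -> float:
    """Degree to which A is right of B, using an assumed cos^2 membership function."""
    width = 2 * math.pi / len(hist)
    degree = 0.0
    for i, weight in enumerate(hist):
        center = i * width
        if center > math.pi:
            center -= 2 * math.pi  # map the bin centre into (-pi, pi]
        if abs(center) < math.pi / 2:
            degree += weight * math.cos(center) ** 2
    return degree


a = [(5.0, 0.0), (6.0, 1.0)]
b = [(0.0, 0.0), (0.0, 1.0)]
print(degree_right_of(angle_histogram(a, b)))
```

Other relations ("above", "left of", "below") would follow by shifting the membership function by the corresponding angle.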
Article
Automated scene description is a very important part of, but also a very hard task in, high-level computer vision. Today, scene description is being applied in real problems such as robotic navigation. In this paper, we present a fuzzy rule-based approach that accomplishes the task of automated linguistic scene description. Membership functions for spatial relations between different components in the scene are evaluated, guided by a spatial relationship matrix that describes typical scene entities for which these relative locations are desirable. A fuzzy logic rule-based system is developed to combine the spatial relationships and other important scene properties to generate the final linguistic interpretation. Excellent results from several image examples of different types show the applicability of this approach.
Conference Paper
Representation of relative spatial relations between objects is often required in many multimedia database applications because spatial relations between objects in an image convey important information about the image. Quantitative representation of spatial relations taking into account shape, size, orientation and distance is often required. The R-Histogram is such a quantitative representation of spatial relations between two objects. However, this method only considers pixels on the object boundary, assuming that the objects are homeomorphic to a 2-ball. For objects with more complicated topology, we propose in this paper the R*-Histogram, a new extension to the R-Histogram. The R*-Histogram generalizes the R-Histogram by taking into account all the pixels in the objects. We also introduce an efficient O(kN log N) time algorithm to compute the R*-Histogram, which is asymptotically faster than the original O(N^2) time algorithm for the R-Histogram even when k = O(n). Here, N = n^2 denotes the number of pixels in the processed n x n image and k is the number of different directions considered. The effectiveness of the R*-Histogram is evaluated empirically with a Query By Example (QBE) system on a database of 2000 synthetic images containing objects with complicated shape and topology. Experiments have shown that the similarity search results match human intuition very well.
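The complexity claim in this abstract can be checked with a short worked comparison. Substituting the number of pixels N = n² of an n x n image and taking k = O(n) directions, as the abstract states, gives:

```latex
\begin{align*}
N &= n^2, \qquad k = O(n) \\
kN\log N &= O\!\left(n \cdot n^2 \cdot \log n^2\right) = O\!\left(n^3 \log n\right) \\
N^2 &= n^4 \\
\frac{n^3 \log n}{n^4} &= \frac{\log n}{n} \longrightarrow 0 \quad (n \to \infty)
\end{align*}
```

So even with a number of directions that grows linearly in the image side length, the O(kN log N) bound stays asymptotically below the O(N²) bound.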
Conference Paper
Modeling the representation of spatial relations is a basic task in computer graphics comprehension. However, previous models usually target a single kind of relation, such as direction, topology, or distance. These models are relatively independent of one another and inconsistent with human cognitive logic, so a unified model of spatial relations representation based on different reference frames is required. In this paper, we first discuss the importance of a reference frame for building such a model. We then introduce the feasibility of histogram modeling for a unified representation of spatial relations, and describe the construction of a spatial relations representation model based on the visual area histogram under the deictic reference frame. Two typical examples are given to verify the correctness of the model. Finally, we point out the advantages of this model, as well as future work.
Article
Definitions of the degree of adjacency of two regions in the plane, and the degree of surroundedness of one region by another, are proposed. Some elementary properties of these concepts are established and it is also shown that they have natural generalizations to fuzzy subsets of the plane. Applications of the proposed measures to digital polygons are demonstrated and fast algorithms for computing these measures are given.
Article
Natural language descriptions are an important step in bridging the gap between numerical representations of spatial data and the human user. In this work, we present a system for generating linguistic descriptions of the spatial relationships between two-dimensional objects. The most pertinent relations for the description are chosen based on a fuzzification of the set relations DISJOINT, OVERLAP, SUBSET, SUBSETi and EQUAL. A handful of relevant Allen relations is then selected and their Allen F-histograms are analysed to extract further topological and directional information. The approach is validated using several sets of real and synthetic data. Keywords: spatial relationships; topological relations; Allen relations; set
Article
This paper outlines further advances from initial findings previously reported in [O. Sjahputera, J.M. Keller, Possibilistic C-means in scene matching, Fourth Internat. Conf. of the European Society for Fuzzy Logic and Technology (EUSFLAT), 2005, pp. 669–675]. We propose a scene matching approach based on spatial relationships among objects in the images to determine if two images acquired under different viewing conditions capture the same scene. This is a difficult problem in computer vision. Our approach produces a mapping of objects from one view to the other, and recovers the viewing transformation parameters. The core of the system relies on capturing spatial relationship information through Force Histograms as affine-invariant image descriptors. Object mapping across images is performed by finding the best correspondence map (FMAP) between force histograms in the two images. The major problem is that the number of potential FMAPs is large, even for modest numbers of scene objects. Hence, search optimization is required. The correct FMAP contains histogram correspondences represented by similar feature vectors. Therefore, dense regions in the feature space are suspected to contain these vectors. Possibilistic C-means (PCM) clustering is used to find these dense regions. The centroids of these dense regions are used to generate the FMAPs. Previously, the FMAP was generated using a nearest-neighbor-like approach. In this study, we propose an improved version of this method by incorporating fuzzy memberships into the FMAP building process. Here, the fitness of each FMAP candidate is assessed with respect to all histogram correspondences already in the FMAP, not just from an initial seed point alone. The best FMAP is selected and translated into a mapping scheme that connects the objects in the two images.
Article
Fuzzy set methods have been used to model and manage uncertainty in various aspects of image processing, pattern recognition, and computer vision. High-level computer vision applications hold a great potential for fuzzy set theory because of its links to natural language. Linguistic scene description, a language-based interpretation of regions and their relationships, is one such application that is starting to bear the fruits of fuzzy set theoretic involvement. In this paper, we are expanding on two earlier endeavors. We introduce new families of fuzzy directional relations that rely on the computation of histograms of forces. These families preserve important relative position properties. They provide inputs to a fuzzy rule base that produces logical linguistic descriptions along with assessments as to the validity of the descriptions. Each linguistic output uses hedges from a dictionary of about 30 adverbs and other terms that can be tailored to individual users. Excellent results from several synthetic and real image examples show the applicability of this approach.
Conference Paper
This work presents a conceptual framework for representing, manipulating, measuring, and communicating in natural language several ideas about topological (non-metric) spatial locations, object spatial contexts, and user expectations of spatial relationships. It articulates a theory of spatial relations: how they can be represented internally as fuzzy predicates, how they can be appropriately derived from imagery, how they can be augmented or filtered using prior knowledge, and, lastly, how they can produce natural language statements about location and space. This framework quantifies the notions of context and vagueness, so that all spatial relations are measurably accurate, provably efficient, and matched to users' expectations. The work makes explicit two critical heuristics for reducing the complexity of the relationships implicit in imagery: one a general rule for single object descriptions, and the other a general rule for rank ordering object relationships. A derived working system combines variable aspects of computer science and linguistics in such a way as to be extensible to many environments. The system has been demonstrated both in a landmark navigation task and in a medical task, two very separate domains, and has been evaluated in both
Article
Properties of objects and spatial relations between objects play an important role in rule-based approaches for high-level vision. The partial presence or absence of such properties and relationships can supply both positive and negative evidence for region labeling hypotheses. Similarly, fuzzy labeling of a region can generate new hypotheses pertaining to the properties of the region, its relation to the neighboring regions, and, finally, hypotheses pertaining to the labels of the neighboring regions. A unified methodology that can be used to characterize both properties and spatial relationships of object regions in a digital image is presented. The methods proposed for computing the properties and relations of image regions can be used to arrive at more meaningful decisions about the contents of the scene
Article
The fuzzy qualitative evaluation of directional spatial relationships (such as “to the right of”, “to the south of...”) between areal objects often relies on the computation of a histogram of angles, which is considered to provide a good representation of the relative position of an object with regard to another. In this paper, the notion of the histogram of forces is introduced. It generalizes and may supersede the histogram of angles. The objects (2D entities) are handled as longitudinal sections (1D entities), not as points (0D entities). It is thus possible to fully benefit from the power of integral calculus and, so, ensure rapid processing of raster data, as well as of vector data, explicitly considering both angular and metric information.
Article
This thesis describes a connectionist model which learns to perceive spatial events and relations in simple movies of 2-dimensional objects, so as to name the events and relations as a speaker of a particular natural language would. Thus, the model learns perceptually grounded semantics for natural language spatial terms. Natural languages differ -- sometimes dramatically -- in the ways in which they structure space. The aim here has been to have the model be able to perform this learning task for terms from any natural language, and to have learning take place in the absence of explicit negative evidence, in order to rule out ad hoc solutions and to approximate the conditions under which children learn. The central focus of this thesis is a connectionist system which has succeeded in learning spatial terms from a number of different languages. The design and construction of this system have resulted in several technical contributions. The first is a very simple but effective means of ...
3D Scene Description and Construction Using Spatial Referencing Language
  • S Blisard
Models of Spatial Relationships Based on the Φ-Descriptor
  • P Matsakis
  • M Naeem
  • J Francis