Anthony G. Cohn
University of Leeds · School of Computing
367
Publications
70,216
Reads
13,253
Citations
Publications (367)
Qualitative Spatial Reasoning is a well explored area of Knowledge Representation and Reasoning and has multiple applications ranging from Geographical Information Systems to Robotics and Computer Vision. Recently, many claims have been made for the reasoning capabilities of Large Language Models (LLMs). Here, we investigate the extent to which a s...
Large language models (LLMs) are stochastic, and not all models give deterministic answers, even when setting temperature to zero with a fixed random seed. However, few benchmark studies attempt to quantify uncertainty, partly due to the time and cost of repeated experiments. We use benchmarks designed for testing LLMs' capacity to reason about car...
Spatial reasoning plays a vital role in both human cognition and machine intelligence, prompting new research into language models' (LMs) capabilities in this regard. However, existing benchmarks reveal shortcomings in evaluating qualitative spatial reasoning (QSR). These benchmarks typically present oversimplified scenarios or unclear natural lang...
We investigate the abilities of a representative set of Large Language Models (LLMs) to reason about cardinal directions (CDs). To do so, we create two datasets: the first, co-created with ChatGPT, focuses largely on recall of world knowledge about CDs; the second is generated from a set of templates, comprehensively testing an LLM's ability to det...
Human sensorimotor decision making has a tendency to get ‘stuck in a rut’, being biased towards selecting a previously implemented action structure (hysteresis). Existing explanations propose this is the consequence of an agent efficiently modifying an existing plan, rather than creating a new plan from scratch. Instead, we propose that hysteresis...
The recent increased interest in structural health monitoring (SHM) related to material performance has necessitated the application of advanced data analysis techniques for interpreting real-time data in decision-making. Currently, an accurate and efficient approach for the timely analyses of large volumes of uncertain sensor data is not well-e...
Artificial intelligence (AI) has made remarkable progress across various domains, with large language models like ChatGPT gaining substantial attention for their human-like text-generation capabilities. Despite these achievements, improving spatial reasoning remains a significant challenge for these models. Benchmarks like StepGame evaluate AI spat...
RCC*-9 is a mereotopological qualitative spatial calculus for simple lines and regions. RCC*-9 can be easily expressed in other existing models for topological relations and thus can be viewed as a candidate for being a “bridge” model among various approaches. In this paper, we present a revised and extended version of RCC*-9, which can handle non-...
Introduction: Our work introduces a real-time robotic localization and mapping system for buried pipe networks.
Methods: The system integrates non-vision-based exploration and navigation with an active-vision-based localization and topological mapping algorithm. This algorithm is selectively activated at topologically key locations, such as junctio...
Service robots are expected to assist users in a constantly growing range of environments and tasks. People may be unique in many ways, and online adaptation of robots is central to personalized assistance. We focus on collaborative tasks in which the human collaborator may not be fully able-bodied, with the aim for the robot to automatically deter...
Sustainable urban infrastructure planning and maintenance require an integrated approach that considers various infrastructure assets (e.g., the ground, roads, and buried pipes) and their inter-linkages as a holistic system. To facilitate the usage of this integrated approach, we propose a model of city infrastructure assets and their interdependen...
Acquiring knowledge about object interactions and affordances can facilitate scene understanding and human-robot collaboration tasks. As humans tend to use objects in many different ways depending on the scene and the objects’ availability, learning object affordances in everyday-life scenarios is a challenging task, particularly in the presence of...
Language models have become very popular recently and many claims have been made about their abilities, including for commonsense reasoning. Given the increasingly better results of current language models on previous static benchmarks for commonsense reasoning, we explore an alternative dialectical evaluation. The goal of this kind of evaluation i...
Aggregate metrics and lack of access to results limit understanding.
We propose a logic of east and west (LEW) for points in 1D Euclidean space. It formalises primitive direction relations: east (E), west (W) and indeterminate east/west (Iew). It has a parameter τ ∈ ℕ_{>1}, which is referred to as the level of indeterminacy in directions. For every τ ∈ ℕ_{>1}, we provide a sound and complete axiomatisation of LEW, and p...
Interaction and action anticipation remains a challenging problem, especially considering the generalizability constraints of trained models from visual data or exploiting visual video embeddings. To overcome these constraints, we present an initial investigation of a novel approach for solving the task of interaction anticipation between objects i...
We propose a hierarchical framework for collaborative intelligent systems. This framework organizes research challenges based on the nature of the collaborative activity and the information that must be shared, with each level building on capabilities provided by lower levels. We review research paradigms at each level, with a description of classi...
Implementing human-level reasoning about action effects is an important competence for a cognitive agent: given precondition and action descriptions, a system should be able to infer the change in the physical world that the action causes. In this work, we propose a new action-effect prediction task. We explore few-shot learning with large pre-trai...
Seismic velocity inversion plays a vital role in various applied seismology processes. A series of deep learning methods have been developed which rely purely on manually provided labels for supervision; however, their performances depend heavily on the utilization of large training datasets with corresponding velocity models. Since no physical law...
In this paper we analyse the issue of reference using spatial language and examine how the polysemy exhibited by spatial prepositions can be incorporated into semantic models for situated dialogue. After providing a brief overview of polysemy in spatial language and a review of related work, we describe an experimental study we used to collect data...
Planning is a computationally expensive process, which can limit the reactivity of autonomous agents. Planning problems are usually solved in isolation, independently of similar, previously solved problems. The depth of search that a planner requires to find a solution, known as the planning horizon, is a critical factor when integrating planners i...
As data sources become ever more numerous with increased feature dimensionality, feature selection for multiview data has become an important technique in machine learning. Semi-supervised multiview feature selection (SMFS) focuses on the problem of how to obtain a discriminative feature subset from heterogeneous feature spaces in the case of abund...
We address the following action-effect prediction task. Given an image depicting an initial state of the world and an action expressed in text, predict an image depicting the state of the world following the action. The prediction should have the same scene context as the input image. We explore the use of the recently proposed GLIDE model for perf...
Location retrieval based on visual information aims to retrieve the location of an agent (e.g. human, robot) or the area they see by comparing the observations with a certain form of representation of the environment. Existing methods generally require precise measurement and storage of the observed environment features, which may not always be robus...
Graph neural networks (GNNs) have achieved outstanding results on relational data. However, such data have uncertain properties; for example, spurious edges may be included. Recently, the variational graph autoencoder (VGAE) has been proposed to solve this problem. However, the distributional assumptions in the variational family restrict the variatio...
The current and future capabilities of Artificial Intelligence (AI) are typically assessed with an ever increasing number of benchmarks, competitions, tests and evaluation standards, which are meant to work as AI evaluation instruments (EI). These EIs are not only increasing in number, but also in complexity and diversity, making it hard to underst...
In order to be trusted by humans, Artificial Intelligence agents should be able to describe rationales behind their decisions. One such application is human action recognition in critical or sensitive scenarios, where trustworthy and explainable action recognizers are expected. For example, reliable pedestrian action recognition is essential for se...
Accurate water inflow assessment at under-construction rock tunnel sites is critical for optimizing subsequent construction and rehabilitation strategies. In this paper, a deep convolutional neural network (DCNN)-based method, named H-ResNet-34, is implemented to classify water inflow category from rock tunnel faces in under-construction highway tu...
This work offers a defect segmentation approach for the nondestructive testing of tunnel lining internal defects using Ground Penetrating Radar (GPR) data. Given GPR synthetic data, it maps the internal defect structure, using a CNN named Segnet coupled with the Lovász softmax loss function, which enhances the accuracy, automation, and efficiency o...
This paper presents a novel integrated method for interactive characterization of fracture spacing in rock tunnel sections. The main procedure includes four steps: (1) Automatic extraction of fracture traces, (2) digitization of trace maps, (3) disconnection and grouping of traces, and (4) interactive measurement of fracture set spacing, total spac...
Tunnel lining internal defect detection is essential for the safe operation of tunnels. This paper presents an automatic scheme based on rotational region deformable convolutional neural network (R²DCNN) and Ground Penetrating Radar (GPR) images for the accurate detection of defects and rebars with arbitrary orientations. The R²DCNN comprises inter...
Whitby is the server-side of an Intelligent Tutoring System application for learning System-Theoretic Process Analysis (STPA), a methodology used to ensure the safety of anything that can be represented with a systems model. The underlying logic driving the reasoning behind Whitby is Situation Calculus, which is a many-sorted logic with situation,...
In this work, the problem of bootstrapping knowledge in language and vision for autonomous robots is addressed through novel techniques in grammar induction and word grounding to the perceptual world. In particular, we demonstrate a system, called OLAV, which is able, for the first time, to (1) learn to form discrete concepts from sensory data; (2)...
A common approach to interpreting spiking activity is based on identifying the firing fields—regions in physical or configuration spaces that elicit responses of neurons. Common examples include hippocampal place cells that fire at preferred locations in the navigated environment, head direction cells that fire at preferred orientations of the anim...
Scribble-supervised semantic segmentation has gained much attention recently for its promising performance without high-quality annotations. Due to the lack of supervision, confident and consistent predictions are usually hard to obtain. Typically, these problems are handled by either adopting an auxiliary task with a well-labeled dataset or incor...
Chinese traditional music has been shown to be effective in emotion regulation for thousands of years. Five groups of Chinese traditional music have been shown in the literature to regulate different emotions (Angry, Depressed, Feverish, Desperate, Sorrowful). 54 audio features are extracted using the Librosa library for each...
This paper presents a hybrid ensemble classifier combining the synthetic minority oversampling technique (SMOTE), a random search (RS) hyper-parameter optimization algorithm, and a gradient boosting tree (GBT) to achieve efficient and accurate rock trace identification. A thirteen-dimensional database consisting of basic, vector, and discontinuity feature...
A DNN architecture referred to as GPRInvNet was proposed to tackle the challenges of mapping the ground-penetrating radar (GPR) B-Scan data to complex permittivity maps of subsurface structures. The GPRInvNet consisted of a trace-to-trace encoder and a decoder. It was specially designed to take into account the characteristics of GPR inversion when...
The ability to compute a quality index for manipulation tasks, in different configurations, has been widely used in robotics. However, it is poorly explored in human manipulation and physical human-robot collaboration (pHRC). Existing works that evaluate efficiency of human manipulation often focus only on heuristic-based, biomechanics or ergonomi...
A variety of civil engineering applications require the identification of cracks in roads and buildings. In such cases, it is frequently helpful for the precise location of cracks to be identified as labelled parts within an image, for example to facilitate precision repair. CrackIT is a crack detection algorithm that allows a user to choos...
Various accounts of cognition and semantic representations have highlighted that, for some concepts, different factors may influence category and typicality judgements. In particular, some features may be more salient in categorisation tasks while other features are more salient when assessing typicality. In this paper we explore the extent to whic...
Early detection of road pavement cracks is important for repairing roads, extending their lifetime, and maintaining city roads. Cracks are hard to detect in images taken with visible-spectrum cameras because of noise and ambiguity with background textures, as well as the lack of distinct features in cracks. Hyperspectral images are se...
In this paper, we present a learning-based approach to determining acceptance of arguments under several abstract argumentation semantics. More specifically, we propose an argumentation graph neural network (AGNN) that learns a message-passing algorithm to predict the likelihood of an argument being accepted. The experimental results demonstrate th...
System safety analysis is a creative process that can often be undertaken by people who are not experts in the system under analysis whilst also learning the analysis methodology. With the increase of system complexity, the high demand for analyses conducted at a scale and the potentially catastrophic consequences of inadequate analysis, there is a...
Detecting material changes from a remote distance is very useful for infrastructure condition monitoring. In this work, we show the potential for using hyperspectral imaging to identify the pavement condition and classify roads based on a spectral footprint. Cracks in the road show interior material of different chemical structure from the surface...
We are interested in the problem where a number of robots, in parallel, are trying to solve reaching through clutter problems in a simulated warehouse setting. In such a setting, we investigate the performance increase that can be achieved by using a human-in-the-loop providing guidance to robot planners. These manipulation problems are challenging...
Achieving “commonsense reasoning” capabilities has been one of the goals of AI since its inception. However, as Marcus and Davis have recently argued, “Common sense is not just the hardest problem for AI; in the long run, it’s also the most important problem”. Moreover, it is generally accepted that space (and time) underlie much of what we regard...
In previous work exploring how to automatically generate typicality measures for spatial prepositions in grounded settings, we considered a semantic model based on Prototype Theory and introduced a method for learning its parameters from data. However, though there is much to suggest that spatial prepositions exhibit polysemy, each term was treated...
Human sensorimotor decision-making has a tendency to get ‘stuck in a rut’, being biased towards selecting a previously implemented action structure (‘hysteresis’). Existing explanations cannot provide a principled account of when hysteresis will occur. We propose that hysteresis is an emergent property of a dynamical system learning from the conseq...
The objective of this project is to learn high-level manipulation planning skills from humans and transfer these skills to robot planners. We used virtual reality to generate data from human participants whilst they reached for objects on a cluttered table top. From this, we devised a qualitative representation of the task space to abstract human d...
Precise mapping of buried utilities is critical to managing massive urban underground infrastructure and preventing utility incidents. Most current research only focuses on generating such maps based on complete information of underground utilities. However, in real-world practice, it is rare that a full picture of buried utilities can be obtained...
Tunnel maintenance requires complex decision making, which involves pathology diagnosis and risk assessment, to ensure full safety while optimising maintenance and repair costs. A Decision Support System (DSS) can play a key role in this process by supporting the decision makers in identifying pathologies based on disorders present in various tunne...
Urban infrastructure assets (e.g. roads, water pipes) perform functions critical to the health and well-being of society. Although it has been widely recognised that different infrastructure assets are highly interconnected, infrastructure management in practice, such as planning, installation and maintenance, is often undertaken by different stakeh...
This research proposes a Ground Penetrating Radar (GPR) data processing method for non-destructive detection of tunnel lining internal defects, called defect segmentation. To perform this critical step of automatic tunnel lining detection, the method uses a CNN called Segnet combined with the Lovász softmax loss function to map the internal defec...
Humans, in comparison to robots, are remarkably adept at reaching for objects in cluttered environments. The best existing robot planners are based on random sampling of configuration space, which becomes excessively high-dimensional with a large number of objects. Consequently, most planners often fail to efficiently find object manipulation plans...
Humans, in comparison to robots, are remarkably adept at reaching for objects in cluttered environments. The best existing robot planners are based on random sampling in configuration space, which becomes excessively high-dimensional with a large number of objects. Consequently, most of these planners suffer from limited object manipulation. We addr...
In cognitive accounts of concept learning and representation three modelling approaches provide methods for assessing typicality: rule-based, prototype and exemplar models. The prototype and exemplar models both rely on calculating a weighted semantic distance to some central instance or instances. However, it is not often discussed how the central...
We propose a logic of directions for points (LD) over 2D Euclidean space, which formalises primary direction relations east (E), west (W), and indeterminate east/west (Iew), north (N), south (S) and indeterminate north/south (Ins). We provide a sound and complete axiomatisation of it, and prove that its satisfiability problem is NP-complete.
A DNN architecture called GPRInvNet is proposed to tackle the challenge of mapping Ground Penetrating Radar (GPR) B-Scan data to complex permittivity maps of subsurface structure. GPRInvNet consists of a trace-to-trace encoder and a decoder. It is specially designed to take account of the characteristics of GPR inversion when faced with complex GPR...
We address the problem of affordance classification for class-agnostic objects considering an open set of actions, by unsupervised learning of object interactions, inducing object affordance classes. A novel qualitative spatial representation incorporating depth information is used to construct Activity Graphs which encode object interactions. These...
Increased population growth and continued urbanisation will necessitate novel, bold, and revolutionary approaches to infrastructure inspection, maintenance, and repair. This will likely be done by swarms of autonomous robotic systems. The University of Leeds is quickly establishing itself as a leader in the field by taking part in two ambitious inf...