Dr. Jakob Suchan
Constructor University Bremen gGmbH · Computer Science and Electrical Engineering

About

56
Publications
12,704
Reads
368
Citations
Introduction
My research lies at the intersection of AI, Computer Vision, and Human-Centred Computing. In particular, I am interested in how we can develop general methods and tools that enable autonomous systems to abstract, understand, reason about, and learn from multimodal (human) interactions, with the aim of assisting humans in their everyday personal and professional tasks.

Publications (56)
Conference Paper
Full-text available
We present a computational framework for the grounding and semantic interpretation of dynamic visuo-spatial imagery consisting of video and eye-tracking data. Driven by cognitive film studies and visual perception research, we demonstrate key technological capabilities aimed at investigating attention & recipient effects vis-a-vis the motion pictur...
Conference Paper
Full-text available
We present a system for generating and understanding dynamic and static spatial relations in robotic interaction setups. Robots describe an environment of moving blocks using English phrases that include spatial relations such as "across" and "in front of". We evaluate the system in robot-robot interactions and show that the system can robustly...
Conference Paper
Full-text available
We propose a hybrid architecture for systematically computing robust visual explanation(s) encompassing hypothesis formation, belief revision, and default reasoning with video data. The architecture consists of two tightly integrated synergistic components: (1) (functional) answer set programming based abductive reasoning with space-time tracklets...
Conference Paper
Full-text available
We demonstrate the need and potential of systematically integrated vision and semantics solutions for visual sensemaking (in the backdrop of autonomous driving). A general method for online visual sensemaking using answer set programming is systematically formalised and fully implemented. The method integrates state of the art in (deep learning bas...
Article
Full-text available
How do the limits of high-level visual processing affect human performance in naturalistic, dynamic settings of (multimodal) interaction where observers can draw on experience to strategically adapt attention to familiar forms of complexity? In this backdrop, we investigate change detection in a driving context to study attentional allocation aimed...
Article
The new DLR Institute of Systems Engineering for Future Mobility (DLR SE) opened its doors at the beginning of 2022. As the new DLR institute emerged from the former OFFIS Division Transportation, it can draw on more than 30 years of experience in the research field of safety-critical systems. With the transition to the German Aerospace Center (DLR...
Chapter
Full-text available
We address computational cognitive vision and perception at the interface of language, logic, cognition, and artificial intelligence. The chapter presents general methods for the processing and semantic interpretation of dynamic visuospatial imagery with a particular emphasis on the ability to abstract, learn, and reason with cognitively rooted str...
Conference Paper
The study of event perception emphasizes the importance of visuospatial attributes in everyday human activities and how they influence event segmentation, prediction and retrieval. Attending to these visuospatial attributes is the first step toward event understanding, and therefore correlating attentional measures to such attributes would help to...
Presentation
Full-text available
Presentation of the conference paper "Challenges in Achieving Explainability for Cooperative Transportation Systems" at the local hub of the "Second International Workshop on Requirements Engineering for Explainable Systems (RE4ES)"
Thesis
Perceptual sensemaking of dynamic visual imagery, e.g., involving semantic grounding, explanation, and learning, is central to a range of tasks where artificial intelligent systems have to make decisions and interact with humans. Towards this, commonsense characterisations of space and motion encompassing spatio-temporal relations, motion patterns,...
Conference Paper
Full-text available
We position recent and emerging research in cognitive vision and perception addressing three key questions: (1) What kind of relational abstraction mechanisms are needed to perform (explainable) grounded inference --e.g., question-answering, qualitative generalisation, hypothetical reasoning-- relevant to embodied multimodal interaction? (2) How ca...
Conference Paper
We position ongoing research aimed at developing a general framework for structured spatio-temporal learning from multimodal human behavioural stimuli. The framework and its underlying general, modular methods serve as a model for the application of integrated (neural) visuo-auditory processing and (semantic) relational learning foundations for app...
Article
We demonstrate the need and potential of systematically integrated vision and semantics solutions for visual sensemaking in the backdrop of autonomous driving. A general neurosymbolic method for online visual sensemaking using answer set programming (ASP) is systematically formalised and fully implemented. The method integrates state of the art in...
Article
Full-text available
We demonstrate the need and potential of systematically integrated vision and semantics solutions for visual sensemaking in the backdrop of autonomous driving. A general neurosymbolic method for online visual sensemaking using answer set programming (ASP) is systematically formalised and fully implemented. The method integrates state of the art in...
Preprint
We demonstrate the need and potential of systematically integrated vision and semantics solutions for visual sensemaking in the backdrop of autonomous driving. A general neurosymbolic method for online visual sensemaking using answer set programming (ASP) is systematically formalised and fully implemented. The method integrates state of the art in...
Conference Paper
Full-text available
We develop a human-centred cognitive model of visuospatial complexity in everyday, naturalistic driving conditions. With a focus on visual perception, the model incorporates quantitative, structural, and dynamic attributes identifiable in the chosen context; the human-centred basis of the model lies in its behavioural evaluation with human subject...
Preprint
We develop a human-centred, cognitive model of visuospatial complexity in everyday, naturalistic driving conditions. With a focus on visual perception, the model incorporates quantitative, structural, and dynamic attributes identifiable in the chosen context; the human-centred basis of the model lies in its behavioural evaluation with human subject...
Conference Paper
Full-text available
Semantic interpretation of dynamic visuospatial imagery calls for a general and systematic integration of methods in knowledge representation and computer vision. Towards this, we highlight research articulating & developing "deep semantics", characterised by the existence of declarative models --e.g., pertaining to "space and motion"-- and correspond...
Conference Paper
Full-text available
Within the autonomous driving domain, there is now a clear need and tremendous potential for hybrid solutions (e.g., integrating semantics, learning, visual computing) towards fulfilling essential legal and ethical responsibilities involving explainability (e.g., for diagnosis), human-centred AI (e.g., interaction design), and industrial standardis...
Preprint
We demonstrate the need and potential of systematically integrated vision and semantics solutions for visual sensemaking (in the backdrop of autonomous driving). A general method for online visual sensemaking using answer set programming is systematically formalised and fully implemented. The method integrates state of the art in (deep learning ba...
Conference Paper
Full-text available
High-level semantic interpretation of (dynamic) visual imagery calls for general and systematic methods integrating techniques in knowledge representation and computer vision. Towards this, we position "deep semantics", denoting the existence of declarative models --e.g., pertaining to "space and motion"-- and corresponding formalisation and methods s...
Preprint
Full-text available
MINDS. MOVEMENT. MOVING IMAGE. consists of two symposia held as part of the 7th International Conference on Spatial Cognition (ICSC 2018), September 10-14, 2018, Rome (Italy). Convener: Mehul Bhatt, www.icsc-rome.org. Symposium 1: Spatial Cognition and the Built Environment. Symposium 2: Visuo-Auditory Perception and the Moving Image. Speakers a...
Chapter
We present ASP Modulo ‘Space-Time’, a declarative representational and computational framework to perform commonsense reasoning about regions with both spatial and temporal components. Supported are capabilities for mixed qualitative-quantitative reasoning, consistency checking, and inferring compositions of space-time relations; these capabilities...
Preprint
We present ASP Modulo 'Space-Time', a declarative representational and computational framework to perform commonsense reasoning about regions with both spatial and temporal components. Supported are capabilities for mixed qualitative-quantitative reasoning, consistency checking, and inferring compositions of space-time relations; these capabilities...
Article
Full-text available
We propose a hybrid architecture for systematically computing robust visual explanation(s) encompassing hypothesis formation, belief revision, and default reasoning with video data. The architecture consists of two tightly integrated synergistic components: (1) (functional) answer set programming based abductive reasoning with space-time tracklets...
Conference Paper
Full-text available
We propose a deep semantic characterization of space and motion categorically from the viewpoint of grounding embodied human-object interactions. Our key focus is on an ontological model that would be amenable to formalisation from the viewpoint of commonsense knowledge representation, relational learning, and qualitative reasoning about space and mot...
Conference Paper
We propose a deep semantic characterisation of space and motion categorically from the viewpoint of grounding embodied human-object interactions. Our key focus is on an ontological model that would be amenable to formalisation from the viewpoint of commonsense knowledge representation, relational learning, and qualitative reasoning about space and mot...
Conference Paper
Full-text available
We present a commonsense, qualitative model for the semantic grounding of embodied visuo-spatial and locomotive interactions. The key contribution is an integrative methodology combining low-level visual processing with high-level, human-centred representations of space and motion rooted in artificial intelligence. We demonstrate practical applicab...
Article
We present a commonsense theory of space and motion for representing and reasoning about motion patterns in video data, to perform declarative (deep) semantic interpretation of visuo-spatial sensor data, e.g., from object tracking, eye tracking, and movement trajectories. The theory has been implemented within constraint logic programming t...
Conference Paper
In this paper we present a novel framework and full implementation of probabilistic spatial reasoning within a Logic Programming context. The crux of our approach is extending Probabilistic Logic Programming (based on distribution semantics) to support reasoning over spatial variables via Constraint Logic Programming. Spatial reasoning is formulate...
Conference Paper
Full-text available
We present an inductive spatio-temporal learning framework rooted in inductive logic programming. With an emphasis on visuo-spatial language, logic, and cognition, the framework supports learning with relational spatio-temporal features identifiable in a range of domains involving the processing and interpretation of dynamic visuo-spatial imagery....
Article
Full-text available
We present an inductive spatio-temporal learning framework rooted in inductive logic programming. With an emphasis on visuo-spatial language, logic, and cognition, the framework supports learning with relational spatio-temporal features identifiable in a range of domains involving the processing and interpretation of dynamic visuo-spatial imagery....
Chapter
Full-text available
This paper presents a computational model of the processing of dynamic spatial relations occurring in an embodied robotic interaction setup. A complete system is introduced that allows autonomous robots to produce and interpret dynamic spatial phrases (in English) given an environment of moving objects. The model unites two separate research strand...
Conference Paper
Full-text available
Evidence-based design (EBD) for architecture involves the study of post-occupancy behaviour of building users with the aim to provide an empirical basis for improving building performance [Hamilton and Watkins 2009]. Within EBD, the high-level, qualitative analysis of the embodied visuo-locomotive experience of representative groups of building use...
Conference Paper
This research is driven by visuo-spatial perception focussed cognitive film studies, where the key emphasis is on the systematic study and generation of evidence that can characterise and establish correlates between principles for the synthesis of the moving image, and its cognitive (e.g., embodied visuo-auditory, emotional) recipient effects on o...
Article
The evidence-based analysis of people's navigation and wayfinding behaviour in large-scale built-up environments (e.g., hospitals, airports) encompasses the measurement and qualitative analysis of a range of aspects including people's visual perception in new and familiar surroundings, their decision-making procedures and intentions, the affordance...
Article
Full-text available
We present a general theory and corresponding declarative model for the embodied grounding and natural language based analytical summarisation of dynamic visuo-spatial imagery. The declarative model --encompassing spatio-linguistic abstractions, image schemas, and a spatio-temporal feature based language generator-- is modularly implemented within...
Conference Paper
Full-text available
This paper presents a computational model of the processing of dynamic spatial relations occurring in an embodied robotic interaction setup. A complete system is introduced that allows autonomous robots to produce and interpret dynamic spatial phrases (in English) given an environment of moving objects. The model unites two separate research strand...
Conference Paper
Full-text available
We propose a commonsense theory of space and motion for the high-level semantic interpretation of dynamic scenes. The theory provides primitives for commonsense representation and reasoning with qualitative spatial relations, depth profiles, and spatio-temporal change; these may be combined with probabilistic methods for modelling and hypothesising...
Article
Full-text available
We position a narrative-centred computational model for high-level knowledge representation and reasoning in the context of a range of assistive technologies concerned with "visuo-spatial perception and cognition" tasks. Our proposed narrative model encompasses aspects such as "space, events, actions, change, and interaction" from the viewpoin...
Article
Full-text available
We construe smart meeting cinematography with a focus on professional situations such as meetings and seminars, possibly conducted in a distributed manner across socio-spatially separated groups. The basic objective in smart meeting cinematography is to interpret professional interactions involving people, and automatically produce dynamic recordin...
Conference Paper
Spatial assistance systems designed to empower people in smart environments need to perceive their operational environment, recognize activities performed in the environment, and reason about the observed information in order to plan a course of action. Activities performed by humans are spatio-temporal interactions between a subject, objects, and...
