Martin Butz

University of Tuebingen | EKU Tübingen · Department of Computer Science

PhD

About

336
Publications
48,128
Reads
5,306
Citations
Additional affiliations
January 2012 - present
University of Tuebingen
Position
  • Professor (Full)


Publications (336)
Preprint
Full-text available
Humans can make predictions on various time scales and hierarchical levels. Thereby, the learning of event encodings seems to play a crucial role. In this work we model the development of hierarchical predictions via autonomously learned latent event codes. We present a hierarchical recurrent neural network architecture, whose inductive learning bi...
Preprint
Full-text available
To effectively perceive and process observations in our environment, feature binding and perspective taking are crucial cognitive abilities. Feature binding combines observed features into one entity, called a Gestalt. Perspective taking transfers the percept into a canonical, observer-centered frame of reference. Here we propose a recurrent neural...
Preprint
Full-text available
Our brain can almost effortlessly decompose visual data streams into background and salient objects. Moreover, it can track the objects and anticipate their motion and interactions. In contrast, recent object reasoning datasets, such as CATER, have revealed fundamental shortcomings of current vision-based AI systems, particularly when targeting exp...
Preprint
The choice of anaphoric reference is a complex process regulated by a combination of linguistic and cognitive constraints. This paper experimentally addresses the impact of world knowledge on the types of references speakers produce, focusing on the predictability of event progressions. In order to avoid confounding event predictability and the pre...
Article
There is increasing effort to integrate Computational Thinking (CT) curricula across all education levels. Therefore, research on CT assessment has lately progressed towards developing and validating reliable CT assessment tools, which are crucial for evaluating students' potential learning progress and thus the effectiveness of suggested curricula...
Article
Full-text available
Time series data is often composed of a multitude of individual, superimposed dynamics. We propose a novel algorithm for inferring time series compositions through evolutionary synchronization of modular networks (ESMoN). ESMoN orchestrates a set of trained dynamic modules, assuming that some of those modules’ dynamics, suitably parameterized, will...
Preprint
Flexible, goal-directed behavior is a fundamental aspect of human life. Based on the free energy minimization principle, the theory of active inference formalizes the generation of such behavior from a computational neuroscience perspective. Based on the theory, we introduce an output-probabilistic, temporally predictive, modular artificial neural...
Article
Full-text available
Currently, there is neither a standardized mode for the documentation of phantom sensations and phantom limb pain, nor for their visualization as perceived by patients. We have therefore created a tool that allows for both, as well as for the quantification of the patient's visible and invisible body image. A first version provides the principal fu...
Article
The article can be viewed here: https://rdcu.be/cGZ43 According to cognitive psychology and related disciplines, the development of complex problem-solving behaviour in biological agents depends on hierarchical cognitive mechanisms. Hierarchical reinforcement learning is a promising computational approach that may eventually yield comparable pro...
Preprint
Full-text available
We introduce a compositional physics-aware neural network (FINN) for learning spatiotemporal advection-diffusion processes. FINN implements a new way of combining the learning abilities of artificial neural networks with physical and structural knowledge from numerical simulation by modeling the constituents of partial differential equations (PDEs)...
Preprint
Full-text available
A common approach to prediction and planning in partially observable domains is to use recurrent neural networks (RNNs), which ideally develop and maintain a latent memory about hidden, task-relevant factors. We hypothesize that many of these hidden factors in the physical world are constant over time, changing only sparsely. Accordingly, we propos...
Article
Bayesian accounts of social cognition successfully model the human ability to infer goals and intentions of others on the basis of their behavior. In this paper, we extend this paradigm to the analysis of ambiguity resolution during brief communicative exchanges. In a reference game experimental setup, we observed that participants were able to inf...
Chapter
Active Tuning is an optimization paradigm specifically designed to increase the robustness and generalization ability of temporal forward models like recurrent neural networks (RNNs). This work explores how the Active Tuning method can be used to optimize the internal dynamics of recurrent spiking neural networks (RSNNs). Active Tuning decouples th...
Chapter
The ability to flexibly bind features into coherent wholes from different perspectives is a hallmark of cognition and intelligence. This binding problem is not only relevant for vision but also for general intelligence, sensorimotor integration, event processing, and language. Various artificial neural network models have tackled this problem. Here...
Chapter
In this paper, we demonstrate that goal-directed behavior unfolds in recurrent spiking neural networks (RSNNs) when intentions are projected onto continuously progressing spike dynamics encoding the recent history of an agent’s state. The projections, which can either be realized via backpropagation through time (BPTT) over a certain time window or...
Chapter
Knowledge about the hidden factors that determine particular system dynamics is crucial for both explaining them and pursuing goal-directed interventions. Inferring these factors from time series data without supervision remains an open challenge. Here, we focus on spatiotemporal processes, including wave propagation and weather dynamics, for which...
Chapter
The ability to develop representations of components and to recombine them in a new but compositionally meaningful manner is considered a hallmark of human cognition, which has not yet been reached by machines. The Omniglot challenge taps into this deficit by posing several one-shot/few-shot generation and classification tasks of handwritten chara...
Article
Full-text available
From about 7 months of age onward, infants start to reliably fixate the goal of an observed action, such as a grasp, before the action is complete. The available research has identified a variety of factors that influence such goal-anticipatory gaze shifts, including the experience with the shown action events and familiarity with the observed agen...
Article
Full-text available
During the observation of goal-directed actions, infants usually predict the goal at an earlier age when the agent is familiar (e.g., human hand) compared to unfamiliar (e.g., mechanical claw). These findings implicate a crucial role of the developing agentive self for infants' processing of others' action goals. Recent theoretical accounts suggest...
Conference Paper
Full-text available
We describe an inference principle for speech resynthesis using the vocal tract simulator VocalTractLab (VTL). Our method generates smooth and plausible motor trajectories controlling the vocal tract simulator. The method utilizes a differentiable forward model approximation of the VTL, namely, an LSTM that learned the involved temporal motor-acous...
Preprint
Full-text available
Purpose: Attractive food elicits approaching behavior, which could be directly assessed in a combination of Virtual Reality (VR) with online motion-capture. Thus, VR enables the assessment of motivated approach and avoidance behavior towards food and non-food cues in controlled laboratory environments. The aim of this study was to test the specificity...
Preprint
Purpose: Theoretical models and behavioral studies indicate faster approach behavior for high-calorie food (approach bias) among healthy participants. A previous study with Virtual Reality (VR) and online motion-capture quantified this approach bias towards food and non-food cues in a controlled VR environment with hand movements. The aim of this...
Preprint
Full-text available
A critical challenge for any intelligent system is to infer structure from continuous data streams. Theories of event-predictive cognition suggest that the brain segments sensorimotor information into compact event encodings, which are used to anticipate and interpret environmental dynamics. Here, we introduce a SUrprise-GAted Recurrent neural netw...
Preprint
Full-text available
Data-driven modeling of spatiotemporal physical processes with general deep learning methods is a highly challenging task. It is further exacerbated by the limited availability of data, leading to poor generalizations in standard neural network models. To tackle this issue, we introduce a new approach called the Finite Volume Neural Network (FINN)....
Article
Full-text available
Strong AI (artificial intelligence that is in all respects at least as intelligent as humans) is still out of reach. Current AI lacks common sense, that is, it is not able to infer, understand, or explain the hidden processes, forces, and causes behind data. Mainstream machine learning research on deep artificial neural networks (ANNs) may even be c...
Preprint
During the observation of goal-directed actions, infants usually predict the goal when the action and the agent are familiar, but they do not as easily predict the goal when the action or the agent are unfamiliar. Recent theoretical accounts suggest that predictive gaze behavior relies on a complex interplay between bottom-up- (e.g., agency cues) a...
Preprint
From about six months of age onwards, infants start to reliably fixate the goal of an observed action, such as a grasp, before the action is complete. The available research has identified a variety of factors that influence such goal-anticipatory gaze shifts, including the experience with the shown action events and familiarity with the observed a...
Preprint
Cognitive Psychology and related disciplines have identified several critical mechanisms that enable intelligent biological agents to learn to solve complex problems. There exists pressing evidence that the cognitive mechanisms that enable problem-solving skills in these species build on hierarchical mental representations. Among the most promising...
Preprint
Full-text available
The ability to flexibly bind features into coherent wholes from different perspectives is a hallmark of cognition and intelligence. Importantly, the binding problem is not only relevant for vision but also for general intelligence, sensorimotor integration, event processing, and language. Various artificial neural network models have tackled this p...
Article
Full-text available
Our minds navigate a continuous stream of sensorimotor experiences, selectively compressing them into events. Event‐predictive encodings and processing abilities have evolved because they mirror interactions between agents and objects—and the pursuance or avoidance of critical interactions lies at the heart of survival and reproduction. However, it...
Chapter
The novel DISTributed Artificial neural Network Architecture (DISTANA) is a generative, recurrent graph convolution neural network. It implements a grid or mesh of locally parameterizable laterally connected network modules. DISTANA is specifically designed to identify the causality behind spatially distributed, non-linear dynamical processes. We s...
Chapter
When comparing human with artificial intelligence, one major difference is apparent: Humans can generalize very broadly from sparse data sets because they are able to recombine and reintegrate data components in compositional manners. To investigate differences in efficient learning, Joshua B. Tenenbaum and colleagues developed the character challe...
Chapter
Our brain receives a dynamically changing stream of sensorimotor data. Yet, we perceive a rather organized world, which we segment into and perceive as events. Computational theories of cognitive science on event-predictive cognition suggest that our brain forms generative, event-predictive models by segmenting sensorimotor data into suitable chunk...
Preprint
Full-text available
We introduce Active Tuning, a novel paradigm for optimizing the internal dynamics of recurrent neural networks (RNNs) on the fly. In contrast to the conventional sequence-to-sequence mapping scheme, Active Tuning decouples the RNN's recurrent neural activities from the input stream, using the unfolding temporal gradient signal to tune the internal...
Chapter
Full-text available
Recent research in the field of spiking neural networks (SNNs) has shown that recurrent variants of SNNs, namely long short-term SNNs (LSNNs), can be trained via error gradients just as effectively as LSTMs. The underlying learning method (e-prop) is based on a formalization of eligibility traces applied to leaky integrate and fire (LIF) neurons. Her...
Preprint
Full-text available
Knowledge of the hidden factors that determine particular system dynamics is crucial for both explaining them and pursuing goal-directed, interventional actions. The inference of these factors without supervision given time series data remains an open challenge. Here, we focus on spatio-temporal processes, including wave propagations and weather dy...
Preprint
Full-text available
The novel DISTributed Artificial neural Network Architecture (DISTANA) is a generative, recurrent graph convolution neural network. It implements a grid or mesh of locally parameterizable laterally connected network modules. DISTANA is specifically designed to identify the causality behind spatially distributed, non-linear dynamical processes. We s...
Preprint
Full-text available
Our brain receives a dynamically changing stream of sensorimotor data. Yet, we perceive a rather organized world, which we segment into and perceive as events. Computational theories of cognitive science on event-predictive cognition suggest that our brain forms generative, event-predictive models by segmenting sensorimotor data into suitable chunk...
Preprint
An utterance is referentially ambiguous if it has several potential referents. Observing how listeners make choices among those referents can reveal their hidden beliefs and preferences, besides giving hints at their responding processes. We asked subjects to observe how one of the objects is chosen following a possibly ambiguous utterance and to i...
Preprint
Full-text available
Recent research in the field of spiking neural networks (SNNs) has shown that recurrent variants of SNNs, namely long short-term SNNs (LSNNs), can be trained via error gradients just as effectively as LSTMs. The underlying learning method (e-prop) is based on a formalization of eligibility traces applied to leaky integrate and fire (LIF) neurons. Her...
Preprint
Full-text available
When comparing human with artificial intelligence, one major difference is apparent: Humans can generalize very broadly from sparse data sets because they are able to recombine and reintegrate data components in compositional manners. To investigate differences in efficient learning, Joshua B. Tenenbaum and colleagues developed the character challe...
Preprint
Full-text available
We introduce a distributed spatio-temporal artificial neural network architecture (DISTANA). It encodes mesh nodes using recurrent, neural prediction kernels (PKs), while neural transition kernels (TKs) transfer information between neighboring PKs, together modeling and predicting spatio-temporal time series dynamics. As a consequence, DISTANA assu...
Conference Paper
Full-text available
Coding as a practical skill and computational thinking (CT) as a cognitive ability have become an important topic in education and research. It has been suggested that CT, as an early predictor of academic success, should be introduced and fostered early in education. However, there is no consensus on the underlying cognitive correlates of CT in yo...
Chapter
Full-text available
The recently introduced REtrospective and PRospective Inference SchEme (REPRISE) infers contextual event states in the form of neural parametric biases retrospectively in recurrent neural networks (RNNs), distinguishing, for example, different sensorimotor control dynamics. Moreover, it actively infers motor commands prospectively in a goal-directe...
Chapter
Full-text available
In this paper we investigate how directional distance signals can be incorporated in RNN-based adaptive goal-direction behavior inference mechanisms, which is closely related to formalizations of active inference. It was shown previously that RNNs can be used to effectively infer goal-directed action control policies online. This is achieved by pro...
Chapter
Full-text available
Learning compositional dynamics with recurrent neural networks (RNNs) trained with back-propagation through time (BPTT) is usually a difficult task. Typically RNNs learn the consecutive shape along target sequences from time step to time step, focusing on local temporal correlations. When the challenge is to identify and model independent, unknown...
Article
Voluntary behavior of humans appears to be composed of small, elementary building blocks or behavioral primitives. While this modular organization seems crucial for the learning of complex motor skills and the flexible adaption of behavior to new circumstances, the problem of learning meaningful, compositional abstractions from sensorimotor experie...
Article
Full-text available
We introduce REPRISE, a REtrospective and PRospective Inference SchEme, which learns temporal event-predictive models of dynamical systems. REPRISE infers the unobservable contextual event state and accompanying temporal predictive models that best explain the recently encountered sensorimotor experiences retrospectively. Meanwhile, it optimizes up...
Article
Full-text available
According to theories of anticipatory behavior control, actions are initiated by predicting their sensory outcomes. From the perspective of event-predictive cognition and active inference, predictive processes activate currently desired events and event boundaries, as well as the expected sensorimotor mappings necessary to realize them, dependent o...
Conference Paper
Full-text available
We are addressing the challenge of learning an inverse mapping between acoustic features and control parameters of a vocal tract simulator. As a first step, we synthesize an articulatory corpus consisting of control parameters and wave forms using VocalTractLab (VTL; [1]) as the vocal tract simulator. The basis for the synthesis is a concatenative...
Preprint
Full-text available
Voluntary behavior of humans appears to be composed of small, elementary building blocks or behavioral primitives. While this modular organization seems crucial for the learning of complex motor skills and the flexible adaption of behavior to new circumstances, the problem of learning meaningful, compositional abstractions from sensorimotor experie...
Chapter
Full-text available
Robot arm control and motion planning in dynamically changing environments is a challenging task. It requires an adaptive planning algorithm that generates solutions on-the-fly, incorporating the current environmental conditions. This paper explores an alternative approach. Adaptive planning is realized in a generative Recurrent Neural Network (RNN...
Conference Paper
Full-text available
Prediction is believed to play an important role in the human brain. However, it is still unclear how predictions are used in the process of learning new movements. In this paper, we present a method to learn movements from visual prediction. The method consists of two phases: learning a visual prediction model for a given movement, then minimizing...
Preprint
Full-text available
We introduce a dynamic artificial neural network-based (ANN) adaptive inference process, which learns temporal predictive models of dynamical systems. We term the process REPRISE, a REtrospective and PRospective Inference SchEme. REPRISE infers the unobservable contextual state that best explains its recently encountered sensorimotor experiences as...
Article
Most studies on spatial memory refer to the horizontal plane, leaving an open question as to whether findings generalize to vertical spaces where gravity and the visual upright of our surrounding space are salient orientation cues. In three experiments, we examined which reference frame is used to organize memory for vertical locations: the one bas...
Article
Full-text available
Spatial, physical, and semantic magnitude dimensions can influence action decisions in human cognitive processing and interact with each other. For example, in the spatial-numerical associations of response code (SNARC) effect, semantic numerical magnitude facilitates left-hand or right-hand responding dependent on the small or large magnitude of n...
Article
It has been suggested that our mind anticipates the future to act in a goal-directed, event-oriented manner. Here we asked whether peripersonal hand space, that is, the space surrounding one's hands, is dynamically and adaptively mapped into the future while planning and executing a goal-directed object manipulation. We thus combined the crossmodal...
Conference Paper
Full-text available
In a recent study, it was demonstrated that Recurrent Neural Networks (RNNs) can be used to effectively control snake-like, many-joint robot arms in a particular way: The inverse kinematics for control are generated using back-propagation through time (BPTT) on recurrent forward models that learned to predict the end-effector pose of a robot arm, w...
Conference Paper
Full-text available
This paper shows that active-inference-based, flexible, adaptive goal-directed behavior can be generated by utilizing temporal gradients in a recurrent neural network (RNN). The RNN learns a dynamical sensorimotor forward model of a partially observable environment. It then uses this model to execute goal-directed policy inference online. The inter...
Conference Paper
Full-text available
Computational thinking (CT) denotes the idea of developing a generic solution to a problem by decomposing it, identifying relevant variables and patterns, and deriving an algorithmic solution procedure. As a general problem solving strategy, it has been suggested as a fundamental cognitive competence to be acquired in education, comparable to literacy...