Martin Butz

University of Tuebingen | EKU Tübingen · Department of Computer Science

PhD


361
Publications
70,311
Reads
6,337
Citations
Additional affiliations
January 2012 - present
University of Tuebingen
Position
  • Professor (Full)


Publications (361)
Article
Full-text available
Mental representations of the environment in infants are sparse and grow richer during their development. Anticipatory eye fixation studies show that infants aged around 7 months start to predict the goal of an observed action, e.g., an object targeted by a reaching hand. Interestingly, goal-predictive gaze shifts occur at an earlier age when the h...
Article
Full-text available
We present a parsimonious deep learning weather prediction model to forecast seven atmospheric variables with 3‐hr time resolution for up to 1‐year lead times on a 110‐km global mesh using the Hierarchical Equal Area isoLatitude Pixelization (HEALPix). In comparison to state‐of‐the‐art (SOTA) machine learning (ML) weather forecast models, such as P...
Preprint
Many studies have demonstrated spatial-numerical associations, but the debate about their origin is still ongoing. Some approaches consider cardinality representations in long-term memory, such as a Mental Number Line, while others suggest ordinality representations, for both numerical and non-numerical stimuli, originating in working or long-term...
Preprint
Full-text available
Mental representations in infants are sparse and grow richer over development. Anticipatory eye fixation studies show that infants at around 7 months start to predict the goal of an observed action, e.g., an object targeted by a reaching hand. Interestingly, goal-predictive gaze shifts occur at an earlier age when the (animate) hand subsequently ma...
Article
Full-text available
Magnitude information, for instance, regarding weight, distance, or velocity, is crucial for planning goal-directed interactions. Accordingly, magnitude information, including numerical magnitude, can affect actions: Responses to small numbers are faster with the left hand than the right and vice versa (hand-based SNARC effect). Previous experiment...
Article
Full-text available
The finite volume neural network (FINN) is an exception among recent physics-aware neural network models as it allows the specification of arbitrary boundary conditions (BCs). FINN can generalize and adapt to various prescribed BC values not provided during training, where other models fail. However, FINN depends explicitly on given BC values and c...
Preprint
Full-text available
Deep learning has recently gained immense popularity in the Earth sciences as it enables us to formulate purely data-driven models of complex Earth system processes. Deep learning-based weather prediction (DLWP) models have made significant progress in the last few years, achieving forecast skills comparable to established numerical weather predict...
Article
Full-text available
Human behavioral choices can reveal intrinsic and extrinsic decision-influencing factors. We investigate the inference of choice priors in situations of referential ambiguity. In particular, we use the scenario of signaling games and investigate to which extent study participants profit from actively engaging in the task. Previous work has revealed...
Article
Full-text available
The Spatial-Numerical Association of Response Codes (SNARC) effect (i.e., faster responses to small numbers with the left compared to the right side and to large numbers with the right compared to the left side) suggests that numbers are associated with space. However, it remains unclear whether the SNARC effect evolves from a number's magnitude or t...
Preprint
Full-text available
Accounting for how the human mind represents the internal and external world is a crucial feature of many theories of human cognition. Central to this question is the distinction between modal as opposed to amodal representational formats. It has often been assumed that one but not both of these two types of representations underlies processing in...
Article
Full-text available
Improved understanding of complex hydrosystem processes is key to advancing water resources research. Nevertheless, the conventional way of modeling these processes suffers from high conceptual uncertainty, due to almost ubiquitous simplifying assumptions used in model parameterizations/closures. Machine learning (ML) models are considered as a pot...
Article
Full-text available
Measurements of three flux towers operated during the land atmosphere feedback experiment (LAFE) are used to investigate relationships between surface fluxes and variables of the land–atmosphere system. We study these relations by means of two machine learning (ML) techniques: multilayer perceptrons (MLP) and extreme gradient boosting (XGB). We com...
Chapter
When modeling physical processes in spatially confined domains, the boundaries require distinct consideration through specifying appropriate boundary conditions (BCs). The finite volume neural network (FINN) is an exception among recent physics-aware neural network models: it allows the specification of arbitrary BCs. FINN is even able to generaliz...
Preprint
According to cognitive psychology and related disciplines, the development of complex problem-solving behaviour in biological agents depends on hierarchical cognitive mechanisms. Hierarchical reinforcement learning is a promising computational approach that may eventually yield comparable problem-solving behaviour in artificial agents and robots. H...
Article
Full-text available
Flexible, goal-directed behavior is a fundamental aspect of human life. Based on the free energy minimization principle, the theory of active inference formalizes the generation of such behavior from a computational neuroscience perspective. Based on the theory, we introduce an output-probabilistic, temporally predictive, modular artificial neural...
Article
With the increasing effectiveness of one-/few-shot learning techniques in the context of handwritten character generation and recognition, the call to extend the commonly associated Omniglot challenge is becoming more pressing. However, the sequential Omniglot dataset represents unrealistically written characters. Therefore, we present new data, a...
Conference Paper
Learning to write is characterized by bottom-up mimicking of characters and top-down writing from memory. We introduce a CNN-RNN model that implements both pathways: It can (i) directly write a letter by generating a motion trajectory given an image, (ii) first classify the character in the image and then determine its motion trajectory `from memor...
Article
Full-text available
Pursuing a precise, focused train of thought requires cognitive effort. Even more effort is necessary when more alternatives need to be considered or when the imagined situation becomes more complex. Cognitive resources available to us limit the cognitive effort we can spend. In line with previous work, an information-theoretic, Bayesian brain appr...
Preprint
Full-text available
Humans can make predictions on various time scales and hierarchical levels. Thereby, the learning of event encodings seems to play a crucial role. In this work we model the development of hierarchical predictions via autonomously learned latent event codes. We present a hierarchical recurrent neural network architecture, whose inductive learning bi...
Preprint
Full-text available
To effectively perceive and process observations in our environment, feature binding and perspective taking are crucial cognitive abilities. Feature binding combines observed features into one entity, called a Gestalt. Perspective taking transfers the percept into a canonical, observer-centered frame of reference. Here we propose a recurrent neural...
Preprint
Full-text available
Our brain can almost effortlessly decompose visual data streams into background and salient objects. Moreover, it can track the objects and anticipate their motion and interactions. In contrast, recent object reasoning datasets, such as CATER, have revealed fundamental shortcomings of current vision-based AI systems, particularly when targeting exp...
Preprint
The choice of anaphoric reference is a complex process regulated by a combination of linguistic and cognitive constraints. This paper experimentally addresses the impact of world knowledge on the types of references speakers produce, focusing on the predictability of event progressions. In order to avoid confounding event predictability and the pre...
Article
There is increasing effort to integrate Computational Thinking (CT) curricula across all education levels. Therefore, research on CT assessment has lately progressed towards developing and validating reliable CT assessment tools, which are crucial for evaluating students' potential learning progress and thus the effectiveness of suggested curricula...
Article
Full-text available
Time series data is often composed of a multitude of individual, superimposed dynamics. We propose a novel algorithm for inferring time series compositions through evolutionary synchronization of modular networks (ESMoN). ESMoN orchestrates a set of trained dynamic modules, assuming that some of those modules’ dynamics, suitably parameterized, will...
Preprint
Flexible, goal-directed behavior is a fundamental aspect of human life. Based on the free energy minimization principle, the theory of active inference formalizes the generation of such behavior from a computational neuroscience perspective. Based on the theory, we introduce an output-probabilistic, temporally predictive, modular artificial neural...
Article
Full-text available
Currently, there is neither a standardized mode for the documentation of phantom sensations and phantom limb pain, nor for their visualization as perceived by patients. We have therefore created a tool that allows for both, as well as for the quantification of the patient's visible and invisible body image. A first version provides the principal fu...
Article
The article can be viewed here: https://rdcu.be/cGZ43 According to cognitive psychology and related disciplines, the development of complex problem-solving behaviour in biological agents depends on hierarchical cognitive mechanisms. Hierarchical reinforcement learning is a promising computational approach that may eventually yield comparable pro...
Chapter
Full-text available
While the symbol grounding problem of agreeing on a mapping between symbols and sensory or even sensorimotor grounded concepts has been solved to a large extent, one possibly even deeper open problem remains: How do concepts and compositional concept structures develop in the first place? Concepts may be described as integrative mental representati...
Preprint
One of the most fundamental effects used to investigate number representations is the Spatial-Numerical Association of Response Codes (SNARC) effect, which shows that responses to small/large numbers are faster with the left/right hand, respectively. However, in recent years, it has been hotly debated whether the SNARC effect is based upon cardinal representa...
Preprint
Full-text available
We introduce a compositional physics-aware neural network (FINN) for learning spatiotemporal advection-diffusion processes. FINN implements a new way of combining the learning abilities of artificial neural networks with physical and structural knowledge from numerical simulation by modeling the constituents of partial differential equations (PDEs)...
Preprint
Full-text available
A common approach to prediction and planning in partially observable domains is to use recurrent neural networks (RNNs), which ideally develop and maintain a latent memory about hidden, task-relevant factors. We hypothesize that many of these hidden factors in the physical world are constant over time, changing only sparsely. Accordingly, we propos...
Article
Bayesian accounts of social cognition successfully model the human ability to infer goals and intentions of others on the basis of their behavior. In this paper, we extend this paradigm to the analysis of ambiguity resolution during brief communicative exchanges. In a reference game experimental setup, we observed that participants were able to inf...
Chapter
Active Tuning is an optimization paradigm specifically designed to increase the robustness and generalization ability of temporal forward models like recurrent neural networks (RNNs). This work explores how the Active Tuning method can be used to optimize the internal dynamics of recurrent spiking neural networks (RSNNs). Active Tuning decouples th...
Chapter
The ability to flexibly bind features into coherent wholes from different perspectives is a hallmark of cognition and intelligence. This binding problem is not only relevant for vision but also for general intelligence, sensorimotor integration, event processing, and language. Various artificial neural network models have tackled this problem. Here...
Chapter
In this paper, we demonstrate that goal-directed behavior unfolds in recurrent spiking neural networks (RSNNs) when intentions are projected onto continuously progressing spike dynamics encoding the recent history of an agent’s state. The projections, which can either be realized via backpropagation through time (BPTT) over a certain time window or...
Chapter
Knowledge about the hidden factors that determine particular system dynamics is crucial for both explaining them and pursuing goal-directed interventions. Inferring these factors from time series data without supervision remains an open challenge. Here, we focus on spatiotemporal processes, including wave propagation and weather dynamics, for which...
Chapter
The ability to develop representations of components and to recombine them in a new but compositionally meaningful manner is considered a hallmark of human cognition, which has not yet been reached by machines. The Omniglot challenge taps into this deficit by posing several one-shot/few-shot generation and classification tasks of handwritten chara...
Article
Full-text available
From about 7 months of age onward, infants start to reliably fixate the goal of an observed action, such as a grasp, before the action is complete. The available research has identified a variety of factors that influence such goal-anticipatory gaze shifts, including the experience with the shown action events and familiarity with the observed agen...
Article
Full-text available
During the observation of goal-directed actions, infants usually predict the goal at an earlier age when the agent is familiar (e.g., human hand) compared to unfamiliar (e.g., mechanical claw). These findings implicate a crucial role of the developing agentive self for infants’ processing of others’ action goals. Recent theoretical accounts suggest...
Conference Paper
Full-text available
We describe an inference principle for speech resynthesis using the vocal tract simulator VocalTractLab (VTL). Our method generates smooth and plausible motor trajectories controlling the vocal tract simulator. The method utilizes a differentiable forward model approximation of the VTL, namely, an LSTM that learned the involved temporal motor-acous...
Preprint
Full-text available
Purpose: Attractive food elicits approach behavior, which can be directly assessed by combining Virtual Reality (VR) with online motion-capture. Thus, VR enables the assessment of motivated approach and avoidance behavior towards food and non-food cues in controlled laboratory environments. The aim of this study was to test the specificity...
Preprint
Full-text available
Purpose: Theoretical models and behavioral studies indicate faster approach behavior for high-calorie food (approach bias) among healthy participants. A previous study with Virtual Reality (VR) and online motion-capture quantified this approach bias towards food and non-food cues in a controlled VR environment with hand movements. The aim of this s...
Preprint
Full-text available
A critical challenge for any intelligent system is to infer structure from continuous data streams. Theories of event-predictive cognition suggest that the brain segments sensorimotor information into compact event encodings, which are used to anticipate and interpret environmental dynamics. Here, we introduce a SUrprise-GAted Recurrent neural netw...
Preprint
Full-text available
Data-driven modeling of spatiotemporal physical processes with general deep learning methods is a highly challenging task. It is further exacerbated by the limited availability of data, leading to poor generalizations in standard neural network models. To tackle this issue, we introduce a new approach called the Finite Volume Neural Network (FINN)....
Article
Full-text available
Strong AI (artificial intelligence that is in all respects at least as intelligent as humans) is still out of reach. Current AI lacks common sense, that is, it is not able to infer, understand, or explain the hidden processes, forces, and causes behind data. Mainstream machine learning research on deep artificial neural networks (ANNs) may even be c...
Preprint
During the observation of goal-directed actions, infants usually predict the goal when the action and the agent are familiar, but they do not as easily predict the goal when the action or the agent is unfamiliar. Recent theoretical accounts suggest that predictive gaze behavior relies on a complex interplay between bottom-up- (e.g., agency cues) a...
Preprint
From about six months of age onwards, infants start to reliably fixate the goal of an observed action, such as a grasp, before the action is complete. The available research has identified a variety of factors that influence such goal-anticipatory gaze shifts, including the experience with the shown action events and familiarity with the observed a...
Preprint
Cognitive Psychology and related disciplines have identified several critical mechanisms that enable intelligent biological agents to learn to solve complex problems. There exists pressing evidence that the cognitive mechanisms that enable problem-solving skills in these species build on hierarchical mental representations. Among the most promising...
Preprint
Full-text available
The ability to flexibly bind features into coherent wholes from different perspectives is a hallmark of cognition and intelligence. Importantly, the binding problem is not only relevant for vision but also for general intelligence, sensorimotor integration, event processing, and language. Various artificial neural network models have tackled this p...
Article
Full-text available
Our minds navigate a continuous stream of sensorimotor experiences, selectively compressing them into events. Event‐predictive encodings and processing abilities have evolved because they mirror interactions between agents and objects—and the pursuance or avoidance of critical interactions lies at the heart of survival and reproduction. However, it...
Chapter
The novel DISTributed Artificial neural Network Architecture (DISTANA) is a generative, recurrent graph convolution neural network. It implements a grid or mesh of locally parameterizable laterally connected network modules. DISTANA is specifically designed to identify the causality behind spatially distributed, non-linear dynamical processes. We s...
Chapter
When comparing human with artificial intelligence, one major difference is apparent: Humans can generalize very broadly from sparse data sets because they are able to recombine and reintegrate data components in compositional manners. To investigate differences in efficient learning, Joshua B. Tenenbaum and colleagues developed the character challe...
Chapter
Our brain receives a dynamically changing stream of sensorimotor data. Yet, we perceive a rather organized world, which we segment into and perceive as events. Computational theories of cognitive science on event-predictive cognition suggest that our brain forms generative, event-predictive models by segmenting sensorimotor data into suitable chunk...
Preprint
Full-text available
We introduce Active Tuning, a novel paradigm for optimizing the internal dynamics of recurrent neural networks (RNNs) on the fly. In contrast to the conventional sequence-to-sequence mapping scheme, Active Tuning decouples the RNN's recurrent neural activities from the input stream, using the unfolding temporal gradient signal to tune the internal...
Chapter
Full-text available
Recent research in the field of spiking neural networks (SNNs) has shown that recurrent variants of SNNs, namely long short-term SNNs (LSNNs), can be trained via error gradients just as effectively as LSTMs. The underlying learning method (e-prop) is based on a formalization of eligibility traces applied to leaky integrate-and-fire (LIF) neurons. Her...
Preprint
Full-text available
Knowledge of the hidden factors that determine particular system dynamics is crucial for both explaining them and pursuing goal-directed, interventional actions. The inference of these factors without supervision given time series data remains an open challenge. Here, we focus on spatio-temporal processes, including wave propagations and weather dy...
Preprint
Full-text available
The novel DISTributed Artificial neural Network Architecture (DISTANA) is a generative, recurrent graph convolution neural network. It implements a grid or mesh of locally parameterizable laterally connected network modules. DISTANA is specifically designed to identify the causality behind spatially distributed, non-linear dynamical processes. We s...
Preprint
Full-text available
Our brain receives a dynamically changing stream of sensorimotor data. Yet, we perceive a rather organized world, which we segment into and perceive as events. Computational theories of cognitive science on event-predictive cognition suggest that our brain forms generative, event-predictive models by segmenting sensorimotor data into suitable chunk...
Preprint
An utterance is referentially ambiguous if it has several potential referents. Observing how listeners make choices among those referents can reveal their hidden beliefs and preferences, besides giving hints at their responding processes. We asked subjects to observe how one of the objects is chosen following a possibly ambiguous utterance and to i...
Preprint
Full-text available
Recent research in the field of spiking neural networks (SNNs) has shown that recurrent variants of SNNs, namely long short-term SNNs (LSNNs), can be trained via error gradients just as effectively as LSTMs. The underlying learning method (e-prop) is based on a formalization of eligibility traces applied to leaky integrate-and-fire (LIF) neurons. Her...
Preprint
Full-text available
When comparing human with artificial intelligence, one major difference is apparent: Humans can generalize very broadly from sparse data sets because they are able to recombine and reintegrate data components in compositional manners. To investigate differences in efficient learning, Joshua B. Tenenbaum and colleagues developed the character challe...
Preprint
Full-text available
We introduce a distributed spatio-temporal artificial neural network architecture (DISTANA). It encodes mesh nodes using recurrent, neural prediction kernels (PKs), while neural transition kernels (TKs) transfer information between neighboring PKs, together modeling and predicting spatio-temporal time series dynamics. As a consequence, DISTANA assu...
Conference Paper
Full-text available
Coding as a practical skill and computational thinking (CT) as a cognitive ability have become important topics in education and research. It has been suggested that CT, as an early predictor of academic success, should be introduced and fostered early in education. However, there is no consensus on the underlying cognitive correlates of CT in yo...
Chapter
Full-text available
The recently introduced REtrospective and PRospective Inference SchEme (REPRISE) infers contextual event states in the form of neural parametric biases retrospectively in recurrent neural networks (RNNs), distinguishing, for example, different sensorimotor control dynamics. Moreover, it actively infers motor commands prospectively in a goal-directe...
Chapter
Full-text available
In this paper, we investigate how directional distance signals can be incorporated into RNN-based adaptive goal-directed behavior inference mechanisms, which are closely related to formalizations of active inference. It was shown previously that RNNs can be used to effectively infer goal-directed action control policies online. This is achieved by pro...