Stefan Heinrich

IT University of Copenhagen · Computer Science

Dr. rer. nat. Dipl.-Inform.

About

38
Publications
13,987
Reads
434
Citations
Introduction
http://stefanheinrich.net
My research interests lie between artificial intelligence, cognitive psychology, and computational neuroscience. I aim to explore the computational principles of the brain, both to deepen our fundamental understanding of its mechanisms and to exploit them in developing intelligent systems. In particular, I look into the processes in the brain that form representations, from multi-modal integration up to the acquisition of complex cognitive functions.
Additional affiliations
March 2020 - present
The University of Tokyo
Position
  • PostDoc Position
Description
  • Independent research in cognitive modelling, cognitive developmental robotics, and machine learning; Project coordination; Grant proposal preparation and writing; PhD student supervision; Higher Education teaching & coaching.
April 2016 - December 2019
University of Hamburg
Position
  • PostDoc Position
Description
  • SFB/Transregio TRR 169 - Crossmodal Learning. Research in machine learning and human-robot interaction; Software development; Project coordination; Grant proposal preparation and writing; Curriculum and examination boards; PhD student supervision.
May 2010 - March 2016
University of Hamburg
Position
  • Research and Teaching Associate
Description
  • Higher education teaching and tutoring; Research in computational modelling, natural language processing, and developmental robotics; Grant proposal preparation and writing; Project coordination; Conference organisation
Education
January 2011 - June 2016
University of Hamburg
Field of study
  • Computer Science
October 2003 - October 2009
Universität Paderborn
Field of study
  • Computer Science, subsidiary subject: Psychology

Publications (38)
Article
Full-text available
The human brain is one of the most complex dynamic systems that enables us to communicate in natural language. We have a good understanding of some principles underlying natural languages and language processing, some knowledge about socio-cultural conditions framing acquisition, and some insights about where activity is occurring in the brain. How...
Conference Paper
Full-text available
In developmental robotics, we model cognitive processes, such as body motion or language processing, and study them in natural real-world conditions. Naturally, these sequential processes inherently occur on different continuous timescales. Similar to how our brain can cope with them by hierarchical abstraction and coupling of different processing mod...
Conference Paper
Full-text available
Recent improvements to Generative Adversarial Networks (GANs) have made it possible to generate realistic images in high resolution based on natural language descriptions such as image captions. However, fine-grained control of the image layout, i.e. where in the image specific objects should be located, is still difficult to achieve. We introduce...
Article
Full-text available
Human infants are able to acquire natural language seemingly easily at an early age. Their language learning seems to occur simultaneously with learning other cognitive functions as well as with playful interactions with the environment and caregivers. From a neuroscientific perspective, natural language is embodied, grounded in most, if not all, s...
Conference Paper
Full-text available
Recurrent neural networks that can capture temporal characteristics on multiple timescales are a key architecture in machine learning solutions as well as in neurocognitive models. A crucial open question is how these architectures can adopt both multi-term dependencies and systematic fluctuations from the data or from sensory input, similar to the...
Preprint
Full-text available
Generative adversarial networks conditioned on textual image descriptions are capable of generating realistic-looking images. However, current methods still struggle to generate images based on complex image captions from a heterogeneous domain. Furthermore, quantitatively evaluating these text-to-image models is challenging, as most evaluation met...
Preprint
Full-text available
Human infants are able to acquire natural language seemingly easily at an early age. Their language learning seems to occur simultaneously with learning other cognitive functions as well as with playful interactions with the environment and caregivers. From a neuroscientific perspective, natural language is embodied, grounded in most, if not all, s...
Article
Full-text available
To overcome novel challenges in complex domestic environments, humanoid robots can learn from human teachers. We propose that the capability for social interaction should be a key factor in this teaching process and benefits both the subjective experience of the human user and the learning process itself. To support our hypothesis, we present a Hum...
Article
Full-text available
In contrast to many established emotion recognition systems, convolutional neural networks do not rely on handcrafted features to categorize emotions. Although achieving state-of-the-art performances, it is still not fully understood what these networks learn and how the learned representations correlate with the emotional characteristics of speech...
Conference Paper
Full-text available
In recent years, there has been an increased interest in the role of social agents as language teachers. Our experiment was designed to investigate whether a physical agent in the form of a social robot provides a better language-learning experience than a virtual agent. We evaluated the interactions regarding enjoyment, immersion, and vocabulary reten...
Article
Full-text available
The problem of generating structured Knowledge Graphs (KGs) is difficult and open but relevant to a range of tasks related to decision making and information augmentation. A promising approach is to study generating KGs as a relational representation of inputs (e.g., textual paragraphs or natural images), where nodes represent the entities and edge...
Conference Paper
Full-text available
We present a unified visuomotor neural architecture for the robotic task of identifying, localizing, and grasping a goal object in a cluttered scene. The RetinaNet-based neural architecture enables end-to-end training of visuomotor abilities in a biological-inspired developmental approach. We demonstrate a successful development and evalua...
Conference Paper
Full-text available
We investigate hierarchical attention networks for the task of question answering. For this purpose, we propose two different approaches: in the first, a document vector representation is built hierarchically from word-to-sentence level which is then used to infer the right answer. In the second, pointer sum attention is utilized to directly infer...
Conference Paper
Full-text available
In this paper, we present an autonomous AI system designed for a Human-Robot Interaction (HRI) study, set around a dice game scenario. We conduct a case study to answer our research question: Does a robot with a socially engaged personality lead to a higher acceptance than a competitive personality? The flexibility of our proposed system allows us...
Article
Full-text available
A variety of brain areas is involved in language understanding and generation, accounting for the scope of language that can refer to many real-world matters. In this work, we investigate how regularities among real-world entities impact on emergent language representations. Specifically, we consider knowledge bases, which represent entities and th...
Article
Full-text available
Tracking arbitrary objects in natural environments is a challenging task in visual computing. A central problem is the need to adapt to changing appearances under strong transformation and occlusion. We propose a tracking framework that utilises the strength of Convolutional Neural Networks to create a robust and adaptive model of the object from t...
Preprint
Full-text available
Recent improvements to Generative Adversarial Networks (GANs) have made it possible to generate realistic images in high resolution based on natural language descriptions such as image captions. Furthermore, conditional GANs allow us to control the image generation process through labels or even natural language descriptions. However, fine-grained...
Conference Paper
Full-text available
Humans develop cognitive functions from a body-rational perspective. Particularly, infants develop representations through sensorimotor environmental interactions and goal-directed actions [1]. This embodiment plays a major role in modeling cognitive functions from active perception to natural language learning. For the developmental robotics commun...
Conference Paper
Full-text available
Tracking arbitrary objects is a challenging task in visual computing. A central problem is the need to adapt to the changing appearance of an object, particularly under strong transformation and occlusion. We propose a tracking framework that utilises the strengths of Convolutional Neural Networks (CNNs) to create a robust and adaptive model of the...
Article
Lake et al. point out that grounding learning in general principles of embodied perception and social cognition is the next step in advancing artificial intelligent machines. We suggest it is necessary to go further and consider lifelong learning, which includes developmental learning, focused on embodiment as applied in developmental robotics and...
Conference Paper
Full-text available
Advancements in Human-Robot Interaction involve robots being more responsive and adaptive to the human user they are interacting with. For example, robots model a personalised dialogue with humans, adapting the conversation to accommodate the user's preferences in order to allow natural interactions. This study investigates the impact of such perso...
Conference Paper
Full-text available
This paper describes the techniques used in the submitted video presenting an interaction scenario, realised using the Neuro-Inspired Companion (NICO) robot. NICO engages the users in a personalised conversation where the robot always tracks the users' face, remembers them and interacts with them using natural language. NICO can also learn to perfo...
Conference Paper
Full-text available
Interdisciplinary research, drawing from robotics, artificial intelligence, neuroscience, psychology, and cognitive science, is a cornerstone to advance the state-of-the-art in multimodal human-robot interaction and neuro-cognitive modeling. Research on neuro-cognitive models benefits from the embodiment of these models into physical, humanoid age...
Conference Paper
Full-text available
The human brain as one of the most complex dynamic systems enables us to communicate and externalise information by natural language. Despite extensive research, human-like communication with interactive robots is not yet possible, because we have not yet fully understood the mechanistic characteristics of the crossmodal binding between language, a...
Conference Paper
Full-text available
We present the robotic system IRMA (Interactive Robotic Memory Aid) that assists humans in their search for misplaced belongings within a natural home-like environment. Our stand-alone system integrates state-of-the-art approaches in a novel manner to achieve a seamless and intuitive human-robot interaction. IRMA directs its gaze toward the speaker...
Conference Paper
Full-text available
Recurrent Neural Networks (RNNs) are powerful architectures for sequence learning. Recent advances on the vanishing gradient problem have led to improved results and an increased research interest. Among recent proposals are architectural innovations that allow the emergence of multiple timescales during training. This paper explores a number of ar...
Thesis
Full-text available
The human brain is one of the most complex dynamic systems that enables us to communicate (and externalise) information by natural language. Our languages go far beyond single sounds for expressing intentions - in fact, human children already join discourse by the age of three. It is remarkable that in these first years they show a tremendous capab...
Chapter
Full-text available
How the human brain understands natural language and how we can exploit this understanding for building intelligent grounded language systems is open research. Recently, researchers claimed that language is embodied in most – if not all – sensory and sensorimotor modalities and that the brain’s architecture favours the emergence of language. In thi...
Conference Paper
Full-text available
Natural language processing in the human brain is complex and dynamic. Models for understanding how the brain’s architecture acquires language need to take into account the temporal dynamics of verbal utterances as well as of action and visual embodied perception. We propose an architecture based on three Multiple Timescale Recurrent Neural Netwo...
Conference Paper
Full-text available
Automatic speech recognition (ASR) technology has been developed to such a level that off-the-shelf distributed speech recognition services are available (free of cost), which allow researchers to integrate speech into their applications with little development effort or expert knowledge leading to better results compared with previously used open-...
Conference Paper
Full-text available
How the human brain understands natural language and what we can learn for intelligent systems is open research. Recently, researchers claimed that language is embodied in most – if not all – sensory and sensorimotor modalities and that the brain’s architecture favours the emergence of language. In this paper we investigate the characteristics of s...
Conference Paper
Full-text available
The development of humanoid robots for helping humans as well as for understanding the human cognitive system is of significant interest in science and technology. How to bridge the large gap between the needs of a natural human-robot interaction and the capabilities of recent humanoid platforms is an important but open question. In this paper we de...
Conference Paper
Full-text available
Recent research has revealed that hierarchical linguistic structures can emerge in a recurrent neural network with a sufficient number of delayed context layers. As a representative of this type of network the Multiple Timescale Recurrent Neural Network (MTRNN) has been proposed for recognising and generating known as well as unknown linguistic u...
Conference Paper
This paper presents a spiking neural network (SNN) for binaural sound source localisation (SSL). The cues used for SSL were the interaural time (ITD) and level (ILD) differences. ITDs and ILDs were extracted with models of the medial superior olive (MSO) and the lateral superior olive (LSO). The MSO and LSO outputs were integrated in a model of the...
Conference Paper
Full-text available
Robust speech recognition under noisy conditions like in human-robot interaction (HRI) in a natural environment often can only be achieved by relying on a headset and restricting the available set of utterances or the set of different speakers. Current automatic speech recognition (ASR) systems are commonly based on finite-state grammars (FSG) or s...
Conference Paper
Full-text available
Achieving cooperation among autonomous and rational agents is still a major challenge. In the past, altruistic cooperation was generally explained through genetic kinship relations. However, the theory of 'cultural kin' is an approach that tries to explain altruism through cultural relatedness. To promote cooperation among autonomous and rational a...
Conference Paper
In this paper we describe new results of statistical and neural data mining of audiology patient records, with the ultimate aim of looking for factors influencing which patients would most benefit from being fitted with a hearing aid. We describe how a combination of neural and statistical techniques can usefully subdivide a set of patients into cl...
