Axel Plinge

Fraunhofer Institute for Integrated Circuits IIS | IIS · Department of Precise Positioning and Analytics

Dr.-Ing.

About

71
Publications
14,952
Reads
468
Citations
Citations since 2016
39 Research Items
391 Citations
Introduction
My research interests include the pragmatic integration of psychophysical research in practical solutions, assistive technology, the modelling of the sensory processing in the human brain and the philosophy of science.
Additional affiliations
June 2020 - November 2020
Fraunhofer Institute for Integrated Circuits IIS
Position
  • Head of Department
January 2017 - May 2020
Fraunhofer Institute for Integrated Circuits IIS
Position
  • Researcher
August 2015 - March 2016
Bar Ilan University
Position
  • Visiting Researcher
Description
  • A novel framework is created by combining the separate lines of research on speaker tracking, sound classification, and speech enhancement.
Education
June 2012 - November 2016
Technische Universität Dortmund
Field of study
  • Computer Science
March 2005 - May 2010
Technische Universität Dortmund
Field of study
  • Computer Science

Publications

Publications (71)
Conference Paper
Full-text available
Tracking multiple speakers with microphone arrays is one of the key tasks in smart environments. For good accuracy in reverberant environments, several arrays should be distributed in the room. The method presented uses distributed nodes with microphone arrays that compute local angular speech detections. In an integrating node, these are assoc...
Article
Today, we are often surrounded by devices with one or more microphones, such as smartphones, laptops, and wireless microphones. If they are part of an acoustic sensor network, their distribution in the environment can be beneficially exploited for various speech processing tasks. However, applications like speaker localization, speaker tracking, an...
Article
The detection and classification of acoustic events in various environments is an important task. Its applications range from multimedia analysis to surveillance of humans or even animal life. Several of these tasks require the capability of online processing. Besides many approaches that tackle the task of acoustic event detection, methods that ar...
Presentation
Full-text available
Deep neural networks (DNNs) have become state-of-the-art for a wide range of applications including computer vision, speech recognition, and robotics. The superior performance often comes at the cost of high computational complexity. The process of creating and training a DNN model is difficult and labor-intensive, and the resulting models rarely opt...
Preprint
Quantum reinforcement learning is an emerging field at the intersection of quantum computing and machine learning. While we intend to provide a broad overview of the literature on quantum reinforcement learning (our interpretation of this term will be clarified below), we put particular emphasis on recent developments. With a focus on already avail...
Preprint
Beamforming-capable antenna arrays overcome the high free-space path loss at higher carrier frequencies. However, the beams must be properly aligned to ensure that the highest power is radiated towards (and received by) the user equipment (UE). While there are methods that improve upon an exhaustive search for optimal beams by some form of hierarch...
Article
Full-text available
Deep Reinforcement Learning (RL) has considerably advanced over the past decade. At the same time, state-of-the-art RL algorithms require a large computational budget in terms of training time to converge. Recent work has started to approach this problem through the lens of quantum computing, which promises theoretical speed-ups for several traditi...
Preprint
Reinforcement learning (RL) has been shown to reach superhuman performance across a wide range of tasks. However, unlike supervised machine learning, learning strategies that generalize well to a wide range of situations remains one of the most challenging problems for real-world RL. Autonomous driving (AD) provides a multi-faceted experimental f...
Preprint
The data representation in a machine-learning model strongly influences its performance. This becomes even more important for quantum machine learning models implemented on noisy intermediate scale quantum (NISQ) devices. Encoding high dimensional data into a quantum circuit for a NISQ device without any loss of information is not trivial and bring...
Preprint
Autonomous driving has the potential to revolutionize mobility and is hence an active area of research. In practice, the behavior of autonomous vehicles must be acceptable, i.e., efficient, safe, and interpretable. While vanilla reinforcement learning (RL) finds performant behavioral strategies, they are often unsafe and uninterpretable. Safety is...
Preprint
Many scenarios in mobility and traffic involve multiple different agents that need to cooperate to find a joint solution. Recent advances in behavioral planning use Reinforcement Learning to find effective and performant behavior strategies. However, as autonomous vehicles and vehicle-to-X communications become more mature, solutions that only util...
Preprint
Full-text available
Deep Reinforcement Learning (RL) has considerably advanced over the past decade. At the same time, state-of-the-art RL algorithms require a large computational budget in terms of training time to converge. Recent work has started to approach this problem through the lens of quantum computing, which promises theoretical speed-ups for several traditi...
Conference Paper
Safe and efficient behavior are the key guiding principles for autonomous vehicles. Manually designed rule-based systems need to act very conservatively to ensure a safe operation. This limits their applicability to real-world systems. On the other hand, more advanced behaviors, i.e., policies, learned through means of reinforcement learning (RL) s...
Conference Paper
Full-text available
Autonomous driving is an active field of research in academia and industry. On the way to the ambitious goal of fully autonomous driving, many in the development of Advanced Driver Assistance Systems (ADASs) address the problem of driving assistance in difficult situations. We embarked on the adventure of using reinforcement learning to design such...
Conference Paper
Full-text available
Given the presence of deep neural networks (DNNs) in all kinds of applications, the question of optimized deployment is becoming increasingly important. One important step is the automated size reduction of the model footprint. Of all the methods emerging, post-training quantization is one of the simplest to apply. Without needing long processing o...
Conference Paper
Full-text available
Individual head-related transfer functions (HRTFs) improve localization accuracy and externalization in binaural audio reproduction compared to generic HRTFs. Listening tests are often conducted using generic HRTFs due to the difficulty of obtaining individual HRTFs for all participants. This study explores the ramifications of the choice of HRTFs...
Conference Paper
Full-text available
The immersion of the user is of key interest in the reproduction of acoustic scenes in virtual reality. It is enhanced when movement is possible in six degrees-of-freedom, i.e., three rotational plus three translational degrees. Further enhancement of immersion can be achieved when the user is not only able to move between distant sound sources, bu...
Conference Paper
Full-text available
Auditory localization cues in the near field (< 1.0 m) differ significantly from those in the far field. The near-field region is within an arm's length of the listener, which allows proprioceptive cues to be integrated when determining the location of an object in space. This perceptual study compares three non-individualized methods to apply head-related tra...
Conference Paper
Full-text available
First-order Ambisonics (FOA) recordings can be processed and reproduced over headphones. They can be rotated to account for the listener's head orientation. However, virtual reality (VR) systems allow the listener to move in six degrees of freedom (6DoF), i.e., three rotational plus three translational degrees of freedom. Here, the apparent angles a...
Conference Paper
Full-text available
Virtual reality systems with multimodal stimulation and up to six degrees-of-freedom movement pose novel challenges to audio quality evaluation. This paper adapts classic multiple stimulus test methodology to virtual reality and adds behavioral tracking functionality. The method is based on ranking by elimination while exploring an audiovisual virt...
Article
Localization of acoustic sources has attracted a considerable amount of research attention in recent years. A major obstacle to achieving high localization accuracy is the presence of reverberation, the influence of which obviously increases with the number of active speakers in the room. Human hearing is capable of localizing acoustic sources even...
Conference Paper
This paper proposes a method for evaluating real-time binaural reproduction systems by means of a wayfinding task in six degrees of freedom. Participants physically walk to sound objects in a virtual reality created by a head-mounted display and binaural audio. The method allows for comparative evaluation of different rendering and tracking systems...
Conference Paper
A novel approach to calibrate the geometry of microphones using a single sound event is proposed. A variant of the expectation-maximization algorithm is employed to estimate the spatial coherence matrix of the reverberant sound field directly from the microphone signals. By matching the spatial coherence to theoretical models, the pairwise micropho...
Thesis
In the modern world, we are increasingly surrounded by computation devices with communication links and one or more microphones. Such devices are, for example, smartphones, tablets, laptops or hearing aids. These devices can work together as nodes in an acoustic sensor network (ASN). Such networks are a growing platform that opens the possibility f...
Conference Paper
This paper proposes a new method of evaluating real-time binaural reproduction systems by means of a wayfinding task in six degrees of freedom (6 DoF). Participants physically walk to sound objects in a virtual reality created by a head mounted display and binaural audio. We show how the quality of spatial audio rendering is reflected by objective...
Article
As we are surrounded by an increased number of mobile devices equipped with wireless links and multiple microphones, e.g., smartphones, tablets, laptops and hearing aids, using them collaboratively for acoustic processing is a promising platform for emerging applications. These devices make up an acoustic sensor network comprised of nodes, i.e. dis...
Conference Paper
Full-text available
In this paper a novel approach for acoustic event detection in sensor networks is presented. Improved and more robust recognition is achieved by making use of the signals from multiple sensors. To this end, various known fusion strategies are evaluated along with a novel method using classifier stacking. A comparative evaluation of these fusion str...
Conference Paper
Full-text available
A multitude of multi-microphone speech enhancement methods is available. In this paper, we focus our attention on the well-known minimum variance distortionless response (MVDR) beamformer, due to its ability to preserve distortionless response towards the desired speaker while minimizing the output noise power. We explore two alternatives for const...
Presentation
Full-text available
A short introduction to installing and using Python for audio research.
Conference Paper
Full-text available
The Bag-of-Features principle proved successful in many pattern recognition tasks ranging from document analysis and image classification to gesture recognition and even forensic applications. Lately these methods emerged in the field of acoustic event detection and showed very promising results. The detection and classification of acoustic events...
Conference Paper
Full-text available
Acoustic Event Detection is the task of recognition of the type and temporal extent of an acoustic event. A novel approach for classifying acoustic events that is based on a Bag-of-Features approach is described. Mel and gammatone frequency cepstral coefficients that originate from psychoacoustic models are used as input features for the Bag-of rep...
Conference Paper
Full-text available
Bregman's Auditory Scene Analysis (ASA) theory of human auditory perception was able to integrate sensory, neurological and psychological research into a coherent picture explaining how humans are able to create a rich and pragmatic representation of the outside world by the movement of two eardrums. With the advent of modern computing power, severa...
Conference Paper
Full-text available
Microphone arrays can be used for a number of applications such as speaker diarization and tracking. For these, it is necessary to calibrate their geometry with good precision. Manual measurement is cumbersome and impractical for ad hoc configurations such as distributed sensor nodes. So a fast automated calibration method that provides sufficient accu...
Conference Paper
Full-text available
Smart rooms are used for a growing number of practical applications. They are often equipped with microphones and cameras allowing acoustic and visual tracking of persons. For that, the geometry of the sensors has to be calibrated. In this paper, a method is introduced that calibrates the microphone arrays by using the visual localization of a spe...
Conference Paper
Full-text available
The classification of acoustic events in indoor environments is an important task for many practical applications in smart environments. In this paper a novel approach for classifying acoustic events that is based on a Bag-of-Features approach is proposed. Mel and gammatone frequency cepstral coefficients that originate from psychoacoustic models...
Data
Full-text available
Tracking speakers is one of the key tasks in smart environments. A neurobiologically inspired realtime system using multiple distributed nodes with small circular microphone arrays is designed to accomplish this task. Each node localizes speakers with a dedicated cochlear and midbrain model. Sparse angular localizations and their spectra are transm...
Conference Paper
Tracking speakers is one of the key tasks in smart environments. A neurobiologically inspired realtime system using multiple distributed nodes with small circular microphone arrays is designed to accomplish this task. Each node localizes speakers with a dedicated cochlear and midbrain model. Sparse angular localizations and their spectra are transm...
Conference Paper
Full-text available
Tracking multiple speakers with microphone arrays is used for practical applications such as video conferencing. An important task is the integration of multiple arrays with correct associations of multiple concurrent speakers. A single-array tracking approach based on CASA is extended here to probabilistic tracking with multiple arrays in order to...
Conference Paper
Full-text available
Online tracking of speakers is an important task for applications in smart environments such as camera control, meeting annotation and speech separation. Challenges for an audio-only system are small-room reverberation, noise, the unknown number of speakers, and gaps occurring in natural speech. Combining models from neurobiology and cognitive ps...
Article
The aim of this study was to investigate the spatial orienting of visual attention in depth under purely stereoscopic viewing conditions. Random-dot stereograms were used to present disparity-defined target stimuli that were either validly or invalidly cued in depth. In separate tasks, participants responded either to the relative depth of the targ...
Conference Paper
Full-text available
Tracking speakers is an important application in smart environments. Acoustic tracking using microphone arrays is a challenging task due to two major reasons: On the one hand, multiple persons may speak simultaneously and thus the number of speakers varies over time; on the other hand, due to the nature of reverberated speech, the provided position...
Conference Paper
Full-text available
A major application area of microphone array processing is the localization of sound sources, mainly of speaking persons. In contrast to most state-of-the-art approaches that are based on correlation measures, we propose a neurologically inspired system that generalizes findings about human spatial hearing to the multi-channel case. It mimics the p...
Patent
The invention relates to a method for processing acoustic voice signals using an electronic processing device. According to the invention, in order to provide an improved processing of acoustic voice signals relative to the state of the art, a processing according to the type of sound is carried out, wherein poorly articulated sounds are lengthened...
Conference Paper
Full-text available
Modern digital hearing aids provide unprecedented means of compensating for hearing impairments. However, this comes at the price of adjusting complex processing parameters. To achieve an individual optimum, all processing parameters have to be carefully pre- and re-adjusted. The required time and effort can often not be made available. To provide...
Article
Full-text available
People who suffer from severe auditory high-tone losses would greatly benefit from replacement or selective enhancement of speech elements or speech features. We have tested two basically different technical methods of frequency transformation that are suitable for implementation on wearable DSP-add-on equipment in a C++ simulation: The first pro...
Article
Full-text available
For several years, we have been working on means to improve speech reception for severely sensory hearing-impaired persons. The work done includes algorithms for non-linear speech processing as well as phoneme spotting and transposition. The overall goal is to implement some of these algorithms into low-power DSPs in a wearable device. Here we...
Conference Paper
We start by considering the needs of users who until now cannot receive badly needed technical compensation for their auditory deficit - because of a lack of efforts to develop special assistive devices for the smaller populations within mainstream industrial development: These patients are characterised by the existence of severe sensory deficits;...
Conference Paper
Full-text available
Hearing impaired people with severe sensory deficit urgently need a perception-based replacement for inaudible fricational features of /s, z, C, t/ (beyond NR and speech enhancement) - to restore high-level breakdown of speech connectedness. Today, shortcomings of past designs can be overcome by digital processing, introducing naturalness and selec...
Article
Full-text available
Sensory hearing-impaired people with severe auditory deficit are neither candidates for cochlear implants nor can they benefit sufficiently from conventional hearing aids. Since they usually suffer from total loss of reception of components from the higher spectrum, but may have sufficient residual sensitivity and selectivity below 1.5 kHz to receiv...
Article
Full-text available
Modern communication technology, if not adapted to the needs of severely hearing-impaired persons, leads to their exclusion from everyday communication. Only if it is well adapted can it offer a higher degree of freedom and integration.
Article
Full-text available
For recovery of speech intelligibility, severely hearing impaired people with a near-total loss of sensitivity at frequencies above 2 kHz urgently need a replacement of perceptually missing frication speech features. Depending on the damage, /s/, frication of /z/, //, and sometimes /t/ and /k/ are important candidates for replacement. A lot of effor...

Projects

Projects (9)
Project
To connect research and industry, the Fraunhofer Institute for Integrated Circuits IIS, in cooperation with Friedrich-Alexander-Universität Erlangen-Nürnberg and Ludwig-Maximilians-Universität München and with the further participation of the Fraunhofer Institutes IKS and IISB, has created a unique research infrastructure in Bavaria: the ADA Lovelace Center for Analytics, Data and Applications. As a cooperation platform for science and industry, it connects AI research with AI applications in an innovative way. What makes the ADA Lovelace Center special is the close link between research and industrial application. Through the way we apply our competencies, methods, and techniques from the field of artificial intelligence to practical problems and develop them further there, the way we approach projects, and the way we work with our partners from industry and science, we want to ease companies' access to comprehensive AI expertise and quickly create concrete benefit for them, beyond the limits of what is currently feasible. In this way we want to convince companies of the enormous potential of AI and bring AI into industrial application: the ADA Lovelace Center sees itself as a multiplier for building up AI competence in a company or for strengthening and further developing the AI competence already there. The spectrum of scientific methods we use at the ADA Lovelace Center for this purpose is very broad, and the choice of the right method depends on the use case. Our work always revolves around data analysis, however: from classical state description through the prediction of events to decision-based methods that, for example, are meant to trigger a specific action automatically. To this end, besides the regional players mentioned, we also involve many other national and international scientific partners.
Research is conducted on the following competence pillars:
  • Automatic Learning
  • Sequence-Based Learning
  • Experience-Based Learning
  • Few Labels Learning
  • Explainable Learning
  • Mathematical Optimization
  • Semantics
  • Few Data Learning
A description of the competence pillars listed above can be found at https://www.scs.fraunhofer.de/de/referenzen/ada-center.html#381039553
Project
Quantum computing can be used to bring machine learning to a new level. We explore, among other approaches, reinforcement learning algorithms in quantum machine learning.
Project
We develop advanced machine learning methodologies for fully autonomous agents. These include reinforcement learning for, e.g., driving assistance. While being autonomous, there is a clear focus on explainability. We employ, among other things, rule extraction to make the learned results transparent and controllable.