About
31
Publications
6,369
Reads
341
Citations
Introduction
Development of audio processing algorithms for localization, separation and classification of sound sources. Current research interests include semi-stationary noise filtering, reverb estimation and removal, voice activity detection, acoustic activity mapping, online source separation, beamforming, and machine learning with audio signals.
Additional affiliations
June 2018 - present
October 2014 - present
October 2014 - May 2018
Publications (31)
Gunshot detection and localization is a frontier technology in security systems. With an increasing rate of shootings globally, awareness of gunshot events and their direction is crucial for law enforcement agencies to respond in a timely manner. This paper presents a real-time, computationally efficient gunshot detection and localization system. First, the per...
Beamforming is a type of audio array processing technique used for interference reduction, sound source localization, and as a pre-processing stage for audio event classification and speaker identification. The auditory scene analysis community can benefit from a systematic evaluation and comparison between different beamforming techniques. In this pa...
Although a significant amount of work has been carried out for visual perception in the context of unmanned aerial vehicles (UAVs), far less has been done regarding auditory perception. The latter can complement the observation of the environment that surrounds a UAV by providing additional information that can be used to detect, classify, and l...
Verifying whether two audio segments belong to the same speaker has recently been put forward as a flexible way to carry out speaker identification, since it does not require re-training when new speakers appear in the auditory scene. Although many of the current techniques have achieved high performance, they require a considerably high amount of...
Online audio source separation has been an important part of auditory scene analysis and robot audition. The main type of technique to carry this out, because of its online capabilities, has been spatial filtering (or beamforming), where it is assumed that the location (mainly, the direction of arrival; DOA) of the source of interest (SOI) is known...
In this work, we address the problem of detecting a UAV flying near another UAV. Usually, computer vision could be used to address this problem by placing cameras onboard the patrolling UAV. However, visual processing is prone to false positives, sensitive to lighting conditions and potentially slow if the image resolution is high. Thus, we propose to car...
Audio analysis over an Unmanned Aerial System (UAS) is of interest because it is an essential step for on-board sound source localization and separation. This could be useful for search & rescue operations, as well as for detection of unauthorized drone operations. In this paper, an analysis of the previously introduced Acoustic Interactions for Robot Aud...
The Acoustic Interactions for Robot Audition corpus is introduced for research on sound source localization and separation, and for multi-user speech recognition. Its aim is to evaluate and train Robot Audition techniques, as well as Auditory Scene Analysis in general. It was recorded in six real-life environments with different noise presence and...
In this paper a strategy for incorporating a flexible and reliable high-level inference module in service robots is presented. This module is a part of the robot's cognitive architecture which coordinates perception, inference and action within the robot's communication and interaction cycle. The present approach relies on an explicit representatio...
Sound source localization (SSL) in a robotic platform has been essential in the overall scheme of robot audition. It allows a robot to locate a sound source by sound alone. It has an important impact on other robot audition modules, such as source separation, and it enriches human–robot interaction by complementing the robot’s perceptual capabiliti...
In this paper a Non-Monotonic Knowledge-Base (KB) for practical applications in service robots is presented. The KB is defined as a conceptual hierarchy with inheritance that supports the expression of defaults and exceptions. All classes and individuals, with their properties and relations, can be updated dynamically and the KB-System supports non...
The field of robot audition has blossomed into its own field throughout its 30 years of history. However, it is still not considered as an essential part of a robotic solution as other functionalities, such as navigation or manipulation. This paper presents some considerations of the overall current state of the field and proposes some ideas of how...
We present the use of direction of arrival (DOA) of sound sources as an index during the interaction between humans and service robots. These indices follow the notion defined by the theory of interpretation of signs by Peirce. This notion establishes a strong physical relation between signs (DOAs) and objects being signified in specific contexts....
Estimating the directions of arrival (DOAs) of multiple simultaneous mobile sound sources is an important step for various audio signal processing applications. In this contribution, we present an approach that improves upon our previous work and is now able to estimate the DOAs of multiple mobile speech sources, while being light in resources, bo...
Sound source localization is important in human interaction, such as in locating the origin of long-distance calls or facing other humans while in a conversation. It is of interest to apply such functionality to the core of human-robot interaction (HRI) and investigate its benefits, if any. In this paper, we propose three strategies for how to inte...
In this paper, we present a concept of service robot and a framework for its functional specification and implementation. The present discussion is grounded in Newell's system levels hierarchy which suggests organizing robotics research in three different layers, corresponding to Marr's computational, algorithmic and implementation levels, as follo...
Knowledge of how many users there are in the environment, and where they are located, is essential for natural and efficient Human-Robot Interaction (HRI). However, carrying out the estimation of multiple Directions-of-Arrival (multi-DOA) on a mobile robotic platform involves a greater challenge as the mobility of the service robot needs to be consi...
A possible solution for the current rate of animal extinction in the world is the use of new technologies in their monitoring in order to tackle problems in the reduction of their populations in a timely manner. In this work we present a system for the identification of the Turdus migratorius bird species based on their singing. The core of the sys...
In this work, we present the speech recognition module of a service robot that performs various tasks, such as hosting a party, receiving multiple commands or giving a guided tour. These tasks take place in diverse acoustic environments, e.g., a home or a supermarket, in which speech is one of the main modalities of interaction. Our approach reli...
In this paper we present SitLog: a declarative situation-oriented logical language for programming situated service robot tasks. The formalism is task and domain independent, and can be used in a wide variety of settings. SitLog can also be seen as a behaviour engineering specification and interpretation formalism to support action selection by auto...
Knowledge of how many users there are in the environment, and where they are located, is essential for natural and efficient Human-Robot Interaction (HRI). However, carrying out the estimation of multiple Directions-of-Arrival (multi-DOA) on a mobile robotic platform involves a greater challenge as the mobility of the service robot needs to be co...
In this paper an interaction-oriented cognitive architecture for the specification and construction of situated systems and service robots is presented. The architecture is centered on an interaction model, called dialogue model, with its corresponding program interpreter or Dialogue Manager. A dialogue model represents the task structure of...
In this paper, we present the development of a tour-guide robot that conducts a poster session through spoken Spanish. The robot is able to navigate around its environment, visually identify informational posters, and explain sections of the posters that users request via pointing gestures. We specify the task by means of dialogue models. A dialog...
The orientation of conversational robots to face their interlocutors is essential for natural and efficient Human-Robot Interaction (HRI). In this paper, progress towards this objective is presented: a service robot able to detect the direction of a user, and orient itself towards him/her, in a complex auditive environment, using only voice and a...
A spectrum sampled from a material or product contains important and relevant information that can be used in many areas of science and engineering, and is being used with increasing frequency in industry, specifically in quality monitoring. Currently, many quality measurements are taken off-line, which is costly and time-...
Frequency displacement, or spectral shift, is commonly observed in industrial spectral measurements. It can be caused by many factors such as sensor de-calibration or by external influences, which include changes in temperature. The presence of frequency displacement in spectral measurements can cause difficulties when statistical techniques, such...
Independent Component Analysis (ICA) is widely used for Blind Source Separation in generic spectra, which are themselves obtained from sensors that can be de-calibrated or are too sensitive to ambient changes. This usually results in frequency displacement or lag that ICA will face during its source extraction. Experiments were done that show tha...
The ability to use spectral data within a control loop is beginning to be considered in many areas, particularly in the Pharmaceutical Industry. However, typical spectral analysis tools, such as Classical Least Squares, are very fragile when handling frequency shifts which may occur in spectral measuring devices as a result of poor calibration or e...
Project (1)
The techniques that up to this point have been pushed forward by the Robot Audition community can be broadly divided into two camps, depending on the type of array used: many-microphone array or binaural. This project aims to find a balance between these two camps by exploring new approaches that aim to lower the number of microphones while maintaining comparable performance to the many-microphone-array techniques.