Caleb Rascon

Universidad Nacional Autónoma de México | UNAM · Instituto de Investigaciones en Matemáticas Aplicadas y Sistemas

Ph.D.

31
Publications
6,369
Reads
341
Citations
Introduction
Development of audio processing algorithms for localization, separation and classification of sound sources. Current research interests include semi-stationary noise filtering, reverb estimation and removal, voice activity detection, acoustic activity mapping, online source separation, beamforming, and machine learning with audio signals.
Additional affiliations
June 2018 - present
Universidad Nacional Autónoma de México
Position
  • Research Associate
Description
  • Auditory Scene Analysis
October 2014 - present
Universidad Nacional Autónoma de México
Position
  • Postgraduate Course Teacher
Description
  • Robot Audition
October 2014 - May 2018
Universidad Nacional Autónoma de México
Position
  • Fellow
Description
  • Robot Audition

Publications (31)
Article
Gunshot detection and localization is a frontier technology in security systems. With the rate of shootings increasing globally, awareness of gunshot events and their direction is crucial for a timely response by law enforcement agencies. This paper presents a real-time, computationally efficient gunshot detection and localization system. First, the per...
Article
Full-text available
Beamforming is a type of audio array processing technique used for interference reduction, sound source localization, and as a pre-processing stage for audio event classification and speaker identification. The auditory scene analysis community can benefit from a systematic evaluation and comparison of different beamforming techniques. In this pa...
Article
Full-text available
Although a significant amount of work has been carried out for visual perception in the context of unmanned aerial vehicles (UAVs), not so much has been done regarding auditory perception. The latter can complement the observation of the environment that surrounds a UAV by providing additional information that can be used to detect, classify, and l...
Article
Verifying whether two audio segments belong to the same speaker has recently been put forward as a flexible way to carry out speaker identification, since it does not require re-training when new speakers appear in the auditory scene. Although many of the current techniques have achieved high performance, they require a considerably high amount of...
Article
Full-text available
Online audio source separation has been an important part of auditory scene analysis and robot audition. The main type of technique to carry this out, because of its online capabilities, has been spatial filtering (or beamforming), where it is assumed that the location (mainly, the direction of arrival; DOA) of the source of interest (SOI) is known...
Article
Full-text available
In this work, we address the problem of detecting a UAV flying near another UAV. Usually, computer vision could be used to face this problem by placing cameras onboard the patrolling UAV. However, visual processing is prone to false positives, sensitive to light conditions and potentially slow if the image resolution is high. Thus, we propose to car...
Article
Full-text available
Audio analysis over an Unmanned Aerial System (UAS) is of interest, as it is an essential step for on-board sound source localization and separation. This could be useful for search & rescue operations, as well as for detection of unauthorized drone operations. In this paper, an analysis of the previously introduced Acoustic Interactions for Robot Aud...
Article
Full-text available
The Acoustic Interactions for Robot Audition corpus is introduced for research on sound source localization and separation, and for multi-user speech recognition. Its aim is to evaluate and train Robot Audition techniques, as well as Auditory Scene Analysis in general. It was recorded in six real-life environments with different noise presence and...
Article
In this paper a strategy for incorporating a flexible and reliable high-level inference module in service robots is presented. This module is a part of the robot's cognitive architecture which coordinates perception, inference and action within the robot's communication and interaction cycle. The present approach relies on an explicit representatio...
Article
Full-text available
Sound source localization (SSL) in a robotic platform has been essential in the overall scheme of robot audition. It allows a robot to locate a sound source by sound alone. It has an important impact on other robot audition modules, such as source separation, and it enriches human–robot interaction by complementing the robot’s perceptual capabiliti...
Article
Full-text available
In this paper a Non-Monotonic Knowledge-Base (KB) for practical applications in service robots is presented. The KB is defined as a conceptual hierarchy with inheritance that supports the expression of defaults and exceptions. All classes and individuals, with their properties and relations, can be updated dynamically and the KB-System supports non...
Article
Full-text available
Robot audition has blossomed into its own field throughout its 30 years of history. However, it is still not considered as essential a part of a robotic solution as other functionalities, such as navigation or manipulation. This paper presents some considerations of the overall current state of the field and proposes some ideas of how...
Article
Full-text available
We present the use of direction of arrival (DOA) of sound sources as an index during the interaction between humans and service robots. These indices follow the notion defined by the theory of interpretation of signs by Peirce. This notion establishes a strong physical relation between signs (DOAs) and objects being signified in specific contexts....
Article
Full-text available
Estimating the directions of arrival (DOAs) of multiple simultaneous mobile sound sources is an important step for various audio signal processing applications. In this contribution, we present an approach that improves upon our previous work and is now able to estimate the DOAs of multiple mobile speech sources, while being light in resources, bo...
Article
Full-text available
Sound source localization is important in human interaction, such as in locating the origin of long-distance calls or facing other humans while in a conversation. It is of interest to apply such functionality to the core of human-robot interaction (HRI) and investigate its benefits, if any. In this paper, we propose three strategies for how to inte...
Article
Full-text available
In this paper, we present a concept of service robot and a framework for its functional specification and implementation. The present discussion is grounded in Newell's system levels hierarchy which suggests organizing robotics research in three different layers, corresponding to Marr's computational, algorithmic and implementation levels, as follo...
Chapter
Full-text available
Knowledge of how many users there are in the environment, and where they are located, is essential for natural and efficient Human-Robot Interaction (HRI). However, carrying out the estimation of multiple Directions-of-Arrival (multi-DOA) on a mobile robotic platform involves a greater challenge as the mobility of the service robot needs to be consi...
Conference Paper
Full-text available
A possible solution for the current rate of animal extinction in the world is the use of new technologies in their monitoring in order to tackle problems in the reduction of their populations in a timely manner. In this work we present a system for the identification of the Turdus migratorius bird species based on their singing. The core of the sys...
Conference Paper
Full-text available
In this work, we present the speech recognition module of a service robot that performs various tasks, such as hosting a party, receiving multiple commands or giving a guided tour. These tasks take place in diverse acoustic environments, e.g., a home or a supermarket, in which speech is one of the main modalities of interaction. Our approach reli...
Article
Full-text available
In this paper we present SitLog: a declarative situation-oriented logical language for programming situated service robot tasks. The formalism is task and domain independent, and can be used in a wide variety of settings. SitLog can also be seen as a behaviour engineering specification and interpretation formalism to support action selection by auto...
Article
Full-text available
Knowledge of how many users there are in the environment, and where they are located, is essential for natural and efficient Human-Robot Interaction (HRI). However, carrying out the estimation of multiple Directions-of-Arrival (multi-DOA) on a mobile robotic platform involves a greater challenge as the mobility of the service robot needs to be co...
Article
Full-text available
In this paper an interaction-oriented cognitive architecture for the specification and construction of situated systems and service robots is presented. The architecture is centered on an interaction model, called dialogue model, with its corresponding program interpreter or Dialogue Manager. A dialogue model represents the task structure of...
Conference Paper
Full-text available
In this paper, we present the development of a tour-guide robot that conducts a poster session through spoken Spanish. The robot is able to navigate around its environment, visually identify informational posters, and explain sections of the posters that users request via pointing gestures. We specify the task by means of dialogue models. A dialog...
Conference Paper
Full-text available
The orientation of conversational robots to face their interlocutors is essential for natural and efficient Human-Robot Interaction (HRI). In this paper, progress towards this objective is presented: a service robot able to detect the direction of a user, and orient itself towards him/her, in a complex auditive environment, using only voice and a...
Thesis
Full-text available
A spectrum sampled from a material or product contains important and relevant information that can be used in many areas of Science and Engineering and is being used with greater frequency in the Industry, specifically in the areas of Quality Monitoring. Currently, many quality measurements are taken in an off-line manner which are costly and time-...
Article
Full-text available
Frequency displacement, or spectral shift, is commonly observed in industrial spectral measurements. It can be caused by many factors such as sensor de-calibration or by external influences, which include changes in temperature. The presence of frequency displacement in spectral measurements can cause difficulties when statistical techniques, such...
Conference Paper
Full-text available
Independent Component Analysis (ICA) is widely used for Blind Source Separation in generic spectrums which are themselves obtained from sensors that can be de-calibrated or are too sensitive to ambience changes. This usually results in frequency displacement or lag that ICA will face during its source extraction. Experiments were done that show tha...
Conference Paper
Full-text available
The ability to use spectral data within a control loop is beginning to be considered in many areas, particularly in the Pharmaceutical Industry. However, typical spectral analysis tools, such as Classical Least Squares, are very fragile when handling frequency shifts which may occur in spectral measuring devices as a result of poor calibration or e...

Project (1)
Project
The techniques that up to this point have been pushed forward by the Robot Audition community can be broadly divided into two camps, depending on the type of array used: many-microphone array or binaural. This project aims to find a balance between these two camps by exploring new approaches that aim to lower the number of microphones while maintaining comparable performance to the many-microphone-array techniques.