Hiroshi G. Okuno

Waseda University | Sōdai · Institute for Human Robot Co-Creation

Ph.D.

About

573
Publications
58,992
Reads
7,503
Citations
Introduction
Drone Audition, Robot Audition, and Computational Auditory Scene Analysis
Additional affiliations
April 2020 - present
Waseda University
Position
  • Researcher
January 2019 - present
Amity University
Position
  • Professor
April 2014 - present
Kyoto University
Position
  • Professor Emeritus

Publications

Article
Full-text available
Bioacoustics monitoring has become increasingly popular for studying the behavior and ecology of vocalizing birds. This study aims to verify the practical effectiveness of localization technology for auditory monitoring of endangered Eurasian bittern (Botaurus stellaris) which inhabits wetlands in remote areas with thick vegetation. Their crepuscul...
Article
Full-text available
Continuum robots can enter narrow spaces and are useful for search and rescue missions in disaster sites. The exploration efficiency at disaster sites improves if the robots can simultaneously acquire several pieces of information. However, a continuum robot that can simultaneously acquire information to such an extent has not yet been designed. Th...
Preprint
1. This study is the first to quantitatively measure the courtship flights of Latham’s snipe (Gallinago hardwickii), a migratory shorebird on the East Asian-Australasian Flyway, which is near threatened in its major breeding ground in Japan. 2. We localised the fine-scale movements of their display flights performed at high altitude and high...
Article
Sound source localization and separation with permutation resolution are essential for achieving a computational auditory scene analysis system that can extract useful information from a mixture of various sounds. Because existing methods cope separately with these problems despite their mutual dependence, the overall result with these approaches c...
Article
Full-text available
To understand the social interactions among songbirds, extracting the timing, position, and acoustic properties of their vocalizations is essential. We propose a framework for automatic and fine-scale extraction of spatial-spectral-temporal patterns of bird vocalizations in a densely populated environment. For this purpose, we used robot audition t...
Article
Full-text available
We developed a sound discrimination device to identify and localize the species of nocturnal animals in their natural habitat. The sound discrimination device is equipped with a microphone, a light-emitting diode, and a band-pass filter. By tuning the center frequency of the filter to include a dominant frequency of the calls of a focal species, we...
Article
Full-text available
Robot audition aims at developing robot's ears that work in the real world, that is, machine listening of multiple sound sources. Its critical problem is noise. Speech interfaces have become more familiar and more indispensable as smartphones and artificial intelligence (AI) speakers spread. Their critical problems are noise and multiple simultaneo...
Article
The role of robot technologies is growing, particularly in areas such as disaster response and infrastructure inspection. For remote operation of the robots, wireless communication is important to control the robots, send commands, and receive telemetry signals. However, it can often be disconnected due to radio wave attenuation, shadowing, or interfer...
Preprint
Full-text available
A multi-rotor helicopter (hereinafter, drone) with sensors for scene analysis is expected to improve real-world tasks including search and rescue tasks. In addition to visual information, acoustic information obtained by sound source localization, position estimation, and separation is critical for conducting such urgent tasks in order to compensat...
Article
Full-text available
Ecoacoustics needs sophisticated acoustic monitoring tools to extract a wide level of features from an observed mixture of sounds. We have developed a portable acoustic monitoring system called ‘HARKBird’ which consists of a laptop PC and an inexpensive commercial microphone array with the robot audition software HARK. HARKBird can extract acoustic...
Article
Full-text available
Drone audition, or auditory processing for drones equipped with a microphone array, is expected to compensate for problems affecting drones' visual processing, in particular occlusion and poor-illumination conditions. The current state of drone audition still assumes a single sound source. When a drone hears sounds originating from multiple sound s...
Article
This paper presents a monaural speech enhancement method for a hose-shaped rescue robot based on a deep speech prior. Speech enhancement is crucial to help a robot operator detect human voices because audio signals captured by a microphone on the robot are contaminated by ego-noise. We have developed three enhancement methods: 1)...
Article
Statically balanced spring mechanisms are used in many applications that support our daily lives. However, creating new designs is a challenging problem since the designer has to simultaneously determine the right number of springs, their connectivity, attachment points, and other parameters. We propose a novel optimization-driven approach for desi...
Chapter
This chapter comprises Sects. 3.1 to 3.5. Section 3.1 first describes the definition of drones and recent trends. The important functions of the search and rescue flying robot are also described in general terms. Section 3.1 also gives an overview of R&D technologies of flying robot in Tough Robotics Challenge and a technical and general discuss...
Chapter
The Active Scope Camera has self-propelled mobility with a ciliary vibration drive mechanism for inspection tasks in narrow spaces but still lacks necessary mobility and sensing capabilities for search and rescue activities. The ImPACT-TRC program aims to improve the mobility of ASC drastically by applying a new air-jet actuation system to float AS...
Chapter
In the Tough Snake Robot Systems Group, a snake robot without wheels (nonwheeled-type snake robot) and a snake robot with active wheels (wheeled snake robot) have been developed. The main target applications of these snake robots are exploration of complex plant structures, such as the interior and exterior of pipes, debris, and even ladders, and t...
Article
We are developing snake robots as a solution for the inspection of plants. The snake robots are constructed by connecting pitch axes and yaw axes alternately. The snake robots realize various locomotion modes. In particular, helical rolling motion is used to move inside and outside of a pipe. In this paper, the design and system of the snake robots are...
Article
Posture estimation of a hose-shaped rescue robot is crucial for handling the flexible robot body. Conventional posture estimation based on inertial sensors gradually accumulates its errors due to unexpected posture change and temperature change. The accumulative error problem can be avoided by using a sound-based method that localizes microphones a...
Article
We have developed wheel-less snake robots and snake robots with active wheels in the ImPACT TRC project. Their applications are plant inspection and disaster response. This paper presents the development and future extension of these snake-like robots.
Article
Full-text available
We report on a simple and practical application of HARK, an easily available and portable system for bird song localization using the open-source robot audition software HARK, to a deeper understanding of the ecoacoustic dynamics of bird songs, focusing on a fine-scaled temporal analysis of song movement–song type dynamics in playback experiments....
Article
We have studied sound source localization, using a microphone array embedded on a UAV (unmanned aerial vehicle), for the purpose of detecting people to rescue from disaster-stricken areas or other dangerous situations, and we have proposed sound source localization methods for use in outdoor environments. In these methods, noise robustness and...
Article
Full-text available
Acoustic interactions are important for understanding intra- and interspecific communication in songbird communities from the viewpoint of soundscape ecology. It has been suggested that birds may divide up sound space to increase communication efficiency in such a manner that they tend to avoid overlap with other birds when they sing. We are intere...
Article
Full-text available
Many animals use sounds produced by conspecifics for mate identification. Female insects and anuran amphibians, for instance, use acoustic cues to localize, orient toward and approach conspecific males prior to mating. Here we present a novel technique that utilizes multiple, distributed sound-indication devices and a miniature LED backpack to visu...
Article
We have developed an active scope camera, a robotic video scope that can move by itself to probe narrow gaps in rescue missions. However, to investigate the interiors of collapsed houses effectively, the robot needs not only high mobility but also sensors such as vision, audition, and haptics. Accordingly, we have dev...
Article
This paper presents a real-time human-voice enhancement method for a hose-shaped rescue robot based on multi-channel low-rank sparse decomposition. Although microphone arrays equipped on hose-shaped robots are crucial for finding victims under collapsed buildings, human voices captured by the microphone array are contaminated by environment-depende...
Article
We are developing a snake robot to inspect pipes in a real plant. The snake robot is constructed by connecting pitch axes and yaw axes alternately. The snake robot can move inside and outside of a pipe with helical rolling motion. We integrated the snake robot with tactile pressure sensors and a sound-based localizatio...
Article
This paper presents a blind multichannel speech enhancement method that can deal with the time-varying layout of microphones and sound sources. Since non-negative tensor factorization (NTF) separates a multichannel magnitude (or power) spectrogram into source spectrograms without phase information, it is robust against the time-varying mixing syste...
Article
Full-text available
In search and rescue activities, unmanned aerial vehicles (UAV) should exploit sound information to compensate for poor visual information. This paper describes the design and implementation of a UAV-embedded microphone array system for sound source localization in outdoor environments. Four critical development problems included water-resistance o...
Conference Paper
This paper addresses online outdoor sound source localization using a microphone array embedded in an unmanned aerial vehicle (UAV). In addition to sound source localization, sound source enhancement and a robust communication method are also described. This system is one instance of deployment of our continuously developed open source software for...
Article
Robot audition is a research field that focuses on developing technologies so that robots can hear sound through their own ears (microphones). By compiling robot audition studies performed over more than 10 years, open source software for research purposes called HARK (Honda Research Institute Japan Audition for Robots with Kyoto University) was re...
Article
Robot audition, the ability of a robot to listen to several things at once with its own “ears,” is crucial to the improvement of interactions and symbiosis between humans and robots. Since robot audition was originally proposed and has been pioneered by Japanese research groups, this special issue on robot audition technologies of the Journal of Ro...
Article
[Figure: Calling behavior of a male Japanese Tree Frog] Sensing the external environment is a core function of robots and autonomous mechanics. This function is useful for monitoring and analyzing the ecosystem for a deeper understanding of nature and for achieving a sustainable ecosystem. Here, we...
Article
This paper presents the design and implementation of a two-stage human-voice enhancement system for a hose-shaped rescue robot. When a microphone-equipped hose-shaped robot is used to search for a victim under a collapsed building, human-voice enhancement is crucial because the sound captured by a microphone array is contaminated by the ego-noise of...
Article
While many robots have been developed to monitor environments, most studies are dedicated to navigation and locomotion and use off-the-shelf sensors. We focus on a novel acoustic device and its processing software, which is designed for a swarm of environmental monitoring robots equipped with the device. This paper demonstrates that a swarm of moni...
Article
Full-text available
Understanding auditory scenes is important when deploying intelligent robots and systems in real-world environments. We believe that robot audition can better recognize acoustic events in the field as compared to conventional methods such as human observation or recording using single-channel microphone array. We are particularly interested in acou...
Article
Two major functions, sound source localization and sound source separation, provided by robot audition open source software HARK exploit the acoustic transfer functions of a microphone array to improve the performance. The acoustic transfer functions are calculated from the measured acoustic impulse response. In the measurement, special signals suc...
Article
We have developed a self-propelling robotic pet, in which the robot audition software HARK (Honda Research Institute Japan Audition for Robots with Kyoto University) was installed to equip it with sound source localization functions, thus enabling it to move in the direction of sound sources. The developed robot, which is not equipped with cameras...
Article
This paper reports the results of our field test of HARKBird, a portable system that consists of robot audition, a laptop PC, and omnidirectional microphone arrays. We assessed its localization accuracy to monitor songs of the great reed warbler (Acrocephalus arundinaceus) in time and two-dimensional space by comparing locational and temporal data...
Article
This paper presents an online method that estimates a 3D posture of a hose-shaped rescue robot using a microphone and accelerometer array. Posture (shape) estimation of a self-driving hose-shaped rescue robot is crucial for handling the robot body because the unseen robot posture deforms in narrow spaces under collapsed buildings. Conventional soun...
Article
The ability of robots to listen to several things at once with their own “ears”, i.e., robot audition, is critical in improving the performance of search and rescue activities under severe conditions. This paper introduces the “HARK” open-source robot audition software and its capability to suppress ego-noise caused by the robot’s own movement...
Article
Full-text available
CAPTCHAs distinguish humans from automated programs by presenting questions that are easy for humans but difficult for computers, e.g., recognition of visual characters or audio utterances. State-of-the-art research suggests that the security of visual and audio CAPTCHAs mainly lies in anti-segmentation techniques, because individual symbol rec...
Conference Paper
This paper presents an online real-time method that enhances human voices included in severely noisy audio signals captured by microphones of a hose-shaped rescue robot. To help a remote operator of such a robot pick up a weak voice of a human buried under rubble, it is crucial to suppress the loud ego-noise caused by the movements of the robot in...
Article
This paper presents an interactive quizmaster robot that can manage a multiparty speech-based quiz game. The basic flow of the quiz game is that (1) the robot reads a question, (2) one or more players answer it, and (3) the robot judges the correctness of the answers. We categorize such speech-based quiz games into school-type interaction and aucti...
Article
Full-text available
The ability of robots to listen to several things at once with their own 'ears', that is, robot audition, is an important factor in improving interaction and symbiosis between humans and robots. The critical issue in robot audition is real-time processing and robustness against noisy environments with high flexibility to support various kinds of ro...
Article
Intrinsic motivation is one of the keys in implementing the mechanism of interest to robots. In this paper, we present a method to apply intrinsic motivation in dynamics learning with predictable and unpredictable targets in view. The robot’s arm is used for the predictable target and the human’s arm is used for the unpredictable target in the expe...
Article
Dance movement is intrinsically connected to the rhythm of music and is a fundamental form of nonverbal communication present in daily human interactions. In order to enable robots to interact with humans in natural real-world environments through dance, these robots must be able to listen to music while robustly tracking the beat of continuous mus...
Patent
A reverberation suppressing apparatus, includes: a sound acquiring unit which acquires a sound signal; a reverberation data computing unit which computes reverberation data from the acquired sound signal; a reverberation characteristics estimating unit which estimates reverberation characteristics based on the computed reverberation data; a filter...
Patent
A first domain satisfying a first condition concerning a current utterance understanding result and a second domain satisfying a second condition concerning a selection history are specified. For each of the first and second domains, indices representing reliability in consideration of the utterance understanding history, selection history, and utt...
Article
In this paper, we present a method to apply intrinsic motivation for improving visuomotor learning of a robot's arm with an external object in view. A Multiple Timescales Recurrent Neural Network (MTRNN) is utilized for learning the robot arm/external object dynamics. Training of the MTRNN is done using the Back Propagation Through Time (BPTT) algorithm. BPTT...
Article
This article presents an offline method for aligning an audio signal to individual instrumental parts constituting a musical score. The proposed method is based on fitting multiple hidden semi-Markov models (HSMMs) to the observed audio signal. The emission probability of each state of the HSMM is described using latent harmonic allocation (LHA), a...
Article
This paper presents an interactive humanoid robot that can moderate a multi-player fastest-voice-first-type quiz game by leveraging state-of-the-art robot audition techniques such as sound source localization and separation and speech recognition. In this game, a player who says 'Yes' first gets a right to answer a question, and players are allowed...
Article
This paper presents an automatic speech recognition (ASR) system that accepts a mixture of various kinds of dialects. The system recognizes dialect utterances on the basis of the statistical simulation of vocabulary transformation and combinations of several dialect models. Previous dialect ASR systems were based on handcrafted dictionaries for sev...
Article
This paper presents a robot quizmaster that has auditory functions (i.e., ears) for moderating a multiplayer quiz game. The most basic form of oral interaction in a quiz game is that a quizmaster reads aloud a question, and each player is allowed to answer it whenever the answer comes to his or her mind. A critical problem in such oral interaction...