
Tatsuya Hirahara
Toyama Prefectural University
Doctor of Engineering
About
93 Publications
15,532 Reads
730 Citations
Additional affiliations
April 2006 - March 2021
January 2004 - March 2006
ATR Human Information Science Laboratories
Position
- Managing Director
July 2000 - December 2003
Publications (93)
This article describes a linear microphone array used for measuring head-related impulse responses simultaneously at various radial distances using the reciprocal method. The microphone array consists of miniature 5.8 mm diameter electret condenser microphones (ECMs) arranged on a boom, using a 3D printed microphone holder with pillars. The frequen...
Spectral cues (SCs) formed by the pinna are known to be essential for sound externalization and for accurate localization of sound-source azimuth and elevation in binaural listeners. SCs are also known to play a key role in monaural sound localization. The experiments described in this article were intended to clarify how changes in SCs associated with head...
The interaural time difference (ITD) plays an important role in spatial hearing, particularly in azimuthal localization of sound images. Although the ITD is essentially determined by the geodesic distance between two ears, researchers have reported that the ITD is greater for lower frequencies. However, the origin of this frequency-dependence has n...
We measured the input impedance characteristics, input voltage versus output sound pressure characteristics, harmonic distortion characteristics, frequency characteristics, and impulse response of a currently available miniature electrodynamic driver unit (Foster Electric, MT006B) when used as a loudspeaker with an open space load. The nominal inpu...
Deep neural network (DNN)-based speech synthesis has become popular in recent years and is expected to soon be widely used in embedded devices and environments with limited computing resources. The key aim of these systems in resource-constrained computing environments is to reduce the computational cost of generating speech parameter sequences while maintaining...
The movement of a sound image may not coincide with that of the sound source in a room with reflective walls. A listener sitting by a reflective wall perceives the movement of the sound image bending around their head when a sound source approaches the wall straight in front of them. In order to investigate this perception quantitatively, we me...
In the early 1900s, two scholars opened the door of psychoacoustics in Japan. Han'ichi Muraoka was sent to the University of Strasbourg in 1878, where he studied physics under August Kundt and received a doctoral degree in 1881. He published his study of the discrimination threshold of Japanese-harp timbre in 1919. Matataro Matsumoto went to Yale Univ...
We investigated how listeners perceive sound image movements of moving sound sources horizontally approaching and retreating from a listener’s head under three conditions: directly listening to approaching and retreating real sound sources, listening to binaurally recorded sounds with headphones, and listening to binaurally synthesized sounds with...
This study compared the horizontal and median plane sound localization performances of binaural signals provided using a pinna-less dummy head or a stereo microphone that turns in synchronization with a listener's head yaw rotation in the head-still and head-movement tasks. Results show that the sound localization performances in the head-movement...
A study was conducted to clarify the sound image movement of sound sources horizontally approaching and retreating from a listener's head at ear height. It investigated how listeners perceive sound image movements of moving sounds under three conditions. These three conditions involved directly listening to approaching and retreating real sound sou...
A TeleHead is a steerable dummy head that tracks a listener's head movement quickly and quietly. We made a personal auditory tele-existence system connecting Toyama Prefectural University and the Research Institute of Electrical Communication, Tohoku University, by using a TeleHead over the Internet. The remote TeleHead can provide dynamic binaural sign...
The effect of a listener's voluntary movement on horizontal sound localization was investigated using a binaural recording/reproduction system with TeleHead, a steerable dummy head. Stimuli were static binaural signals recorded with a still dummy head in a head-still condition, dynamic binaural signals recorded with a dummy head that followed preci...
We measured subjects' head movements during horizontal and median sound localization experiments in which head-rotation was allowed in order to know how they move their heads to localize sound in a head rotation condition. The head movements in a head-rotation condition were measured while localizing 500-Hz low-pass noise, 12-kHz high-pass noise, a...
Some researchers have reported on how sound localization is affected by the temporal variation of the ITD, ILD and SCs. For horizontal sound localization, Perrett and Noble showed that head rotation facilitated sound localization with 2-kHz low-pass noise. They also showed that head rotation facilitated sound localization with the 2- and 4-kHz high...
This paper clarifies the effects of head-and-neck somatic and balance sense information on horizontal sound localization. In the head-still condition, a listener localized a static binaural signal while keeping the head still. In the head-movement condition, a listener localized a dynamic binaural signal recorded with a steerable dummy head controll...
A non-audible murmur (NAM), a very weak whisper sound produced without vocal fold vibration, has been researched in the development of a silent-speech communication tool for functional speech disorders as well as human-to-human/machine interfaces with inaudible voice input. The NAM can be detected using a specially designed microphone, called a NAM...
This paper clarifies the relationship between head movement and sound localization with band-limited noise (12-kHz high-pass, 500-Hz low-pass and 2-4-, 4-8-, 8-12-kHz band-pass filtered noise). 12-kHz high-pass noise mainly provides interaural level difference (ILD) information, while 500-Hz low-pass noise mainly provides interaural time differenc...
The effects of head movement during head-related transfer function (HRTF) measurements are evaluated. Head movements are measured simultaneously with HRTF measurements and spectral differences of the HRTFs are compared among repeated measurements. Without a head support aid, the human subjects’ heads move considerably in all directions. HRTFs for t...
This paper presents an examination of the signal bandwidth necessary to localize real sound sources, and the relationship between bandwidth and the listener's audible frequency range. Horizontal sound localization experiments were conducted with 10 listeners using white noise, high-pass noise with cutoff frequencies (Fc) of 2, 4, 8, 12, or 16 kHz, or...
Binaural technology enables the reproduction of three-dimensional (3-D) sound through earphones but requires an acoustically correct sound reproducing system and high-precision individual head-related transfer functions (HRTFs) for spatially distortion-free 3-D sound reproduction. These requirements for the acoustical strictness of binaural sys...
We measured subjects' head movements during horizontal sound localization experiments to determine how little people move their heads in a head-still condition, and how they move their heads in a head-movement condition. The head movements of eight subjects in a head-still condition were measured while localizing six kinds of band-pass noises,...
A study was conducted to clarify the signal bandwidth necessary for horizontal sound localization using high- and low-pass-filtered noise of real sound sources. The experimental system consisted of a Windows-based PC, two 8-channel digital-to-analog converters (DACs), 12 power amplifiers, and 12 loudspeakers. The sampling frequency of the DACs was...
A study was conducted to measure the head-related transfer functions (HRTF) with a dummy head by the reciprocal and direct methods. The reciprocal HRTF measurement system consisted of a microphone array with fixed microphone sockets 30 degrees apart and an earplug speaker inserted into the dummy's ear canal. The microphone array consisted of a circ...
This paper aims at clarifying the discrepancies between actual-ear and artificial-ear responses. The actual- and artificial-ear responses from five models of insert earphones, three models of intra-concha earphones, and two models of headphones were measured and compared. The actual-ear responses were measured for one driver of each earphone/headph...
This paper clarifies how much signal bandwidth is necessary for horizontal sound localization. Horizontal sound localization experiments were conducted with sixteen listeners using white noise, and fourteen listeners using high-pass noise whose cut-off frequency (Fc) was 2, 4, 8, 12, or 16 kHz, or low-pass noise whose Fc was 0.5, 1, 2, 4, or 8 kHz...
The physical characteristics of weak body-conducted vocal-tract resonance signals called non-audible murmur (NAM) and the acoustic characteristics of three sensors developed for detecting these signals have been investigated. NAM signals attenuate 50 dB at 1 kHz; this attenuation consists of 30-dB full-range attenuation due to air-to-body transmiss...
In this work, head movements for three human subjects were measured simultaneously during head-related transfer function (HRTF) measurement for a period of 95 min each. The subjects' heads moved in all directions during measurements. Excessive head movements were observed in the pitch and yaw directions. Head movements in the roll direction were sm...
The physical nature of weak body-conducted vocal-tract resonance signals called non-audible murmur (NAM) was investigated using numerical simulation and acoustic analysis of the NAM signals. Computational fluid dynamics simulation reveals that a weak vortex flow occurs in the supraglottal region when uttering NAM; a source of NAM is a turbulent no...
A number of works have been reported on robot control using EMG signals. Control of robots, wheelchairs, and rehabilitation aids using the arms, hands or legs by EMG signals has been quite popular and effective. However, few works have dealt with head-movement control using neck EMG signals. We have built a model that estimates continuous human hea...
This paper investigates the source-distance dependency of head-related transfer functions (HRTFs) on the horizontal and median sagittal planes using the boundary-element method and a dummy head scanned with laser and computer tomography scanners. First, the HRTF spectra are compared among various source positions in a head-centered coordinate syste...
A non-audible murmur (NAM), a very weak speech sound produced without vocal cord vibration, can be detected by a special NAM microphone attached to the neck, thereby providing a new speech communication tool for functional speech disorders as well as human-to-machine and human-to-human interfaces with inaudible voice input for use with unimpaired....
The frequency characteristics of five types of non-audible murmur (NAM) microphone, namely SS, UE, DUE, SSP1, and SSP2, were measured. The measurements were done using an audio analyzer, an accelerometer, a bone conduction vibrator, and a urethane elastomer cylinder. A NAM microphone and the accelerometer were placed on top of the cylinder. The accel...
Auditory artifacts due to switching head-related transfer functions (HRTFs) are investigated, using a software-implemented dynamic virtual auditory display (DVAD) developed by the authors. The DVAD responds to a listener's head rotation using a head-tracking device and switching HRTFs to present a highly realistic 3D virtual auditory space to the l...
TeleHead I is an acoustical telepresence robot that we built on the basis of the concept that remote sound localization could be best achieved by using a user-like dummy head whose movement synchronizes with the user's head movement in real time. We clarified the characteristics of the latest version of TeleHead I, TeleHead II, and verified the val...
Vocal tract shapes of a whispered voice and NAM production were obtained by MRI scans. Results show that the vocal tract shapes are diverse among subjects. One subject's results show that a NAM production yields opening and downward movement of glottis as well as downward extension of piriform fossae, compared to a whispered voice production. Final...
A pan-spatial dynamic virtual auditory display (Pan-spatial DVAD) has been developed, which can present virtual sound sources located at any position. The pan-spatial DVAD is an integrated system of the DVAD and an HRTF (Head-Related Transfer Function) calculation server. The DVAD detects the listener's head motion using a head tracking device and...
An effective method of reducing the computational cost of Head Related Transfer Function (HRTF) simulation at high frequencies is to apply the reciprocal theorem. Commonly, the Boundary Element Method is used since both calculation and mesh generation effort are minimal; however, it is sensitive to geometry complexity and unstable during the discr...
A software-implemented dynamic virtual auditory display (DVAD) has been developed by the authors. The DVAD responds to the listener's head rotation by using a head-tracking device and switching head-related transfer functions (HRTFs), thereby presenting a highly realistic virtual auditory space to the listener. The DVAD operates on Windows XP a...
A nonaudible murmur (NAM), a very weak speech sound produced without vocal vibration, can be detected by a special NAM microphone attached to the neck, thereby providing a new communication tool for use with functional speech disorders. The microphone is a condenser microphone covered with soft‐silicone impression material that provides good impeda...
This study examined the effect of head movement on sound localization with a pair of microphones providing no head‐related transfer function (HRTF) information, and with individualized, nonindividualized, and downsized dummy heads providing HRTF information with different degrees of distortion. In an anechoic room, white noise was presented for 5 s...
A multi-degree-of-freedom (DOF) ultrasonic motor can rotate in three DOFs and does not generate noise. In addition, with an appropriate preloading mechanism, it can generate high torque for its size. The multi-DOF ultrasonic motor is, therefore, anticipated for use as a servomotor in the next generation of robots. However, for several reasons, ther...
A multi degree of freedom (DOF) ultrasonic motor can rotate in three degrees of freedom and does not generate noise. In addition, it generates high torque with an appropriate pre-loading mechanism. This motor is therefore anticipated for use as a servomotor in robots in the next generation. This paper proposes a pre-loading mechanism and control al...
The purpose of this study was to build a large database on Japanese vowels and to clarify their acoustic characteristics. Recordings were made of 256 males and 252 females aged from 6 to 76 years old, for both Kanto (Tokyo) and Kansai (Osaka) dialects, producing the vowels /i, e, a, o, u/ in isolation, in /h-V-da/ syllables and in /b-V-ta/ syllabl...
The IEC coupler, dummy-head and actual-ear responses, harmonic distortion, impulse response decay, phase rotation and group delay, external sound radiation, sound attenuation and acoustic crosstalk characteristics of six models of headphones, the TDH39, DT48, HD250 Linear II, HD414 Classic, HDA 200 and SR-Lambda Professional, are measured and compa...
Using modern lifecasting techniques, we have devised life-like dummy heads that replicate precisely the head shape of the dummy head user (the real head). We compared the shape and the head-related transfer function (HRTF) of the dummy heads with those of the real head. From the result, it was confirmed that, although the dummy head's diameter was...
An acoustical tele-presence robot that transfers the sound environment at a remote place should have a human-like head whose movement is synchronized with the listener's head movement. To realize the concept, we built TeleHead I and II. TeleHead I, the prototype, has a molded dummy head and follows the user's head movement. Recently, we built TeleH...
A multi degree of freedom (multi-DOF) ultrasonic motor is anticipated as an actuator in the next generation, because it can drive three-DOF motion with high torque and low noise that cannot be perceived by the human auditory system. Using a multi-DOF ultrasonic motor as an actuator that constitutes the mechanism of the neck of the auditory tele-ex...
An advanced steerable dummy head named "TeleHead II" tracks three-dimensional human head movement quietly in real time. The three-dimensional movement of the dummy head is controlled by human head posture data detected by a head tracker using servomotors for yaw, roll, and pitch. Sound that reaches two microphones set in the dummy head can be repro...
NTT Communication Science Laboratories conducts scientific research aimed at understanding human information-processing mechanisms and technological research aimed at realizing human-friendly interfaces in computer environments. The ultimate goal of our work is to enrich communications among people as well as between people and computers. This spec...
It is known that the inferior colliculus (IC), a major neural structure in the mammalian auditory pathway, contains neurons that are sensitive to the interaural phase difference (IPD) in terms of the firing rate of action potentials or spikes. However, there is no quantitative analysis of the temporal characteristics of neural sensitivities to the...
For secure and high-quality information distribution services in a network society, it is crucial to develop technologies that enable users to get along well with information as well as technologies that handle information properly on computers and networks. To provide a foundation on which we can create a more pleasant and secure information commu...
After prolonged listening to a sound moving across the horizontal plane (adapter), a stationary sound (test) can be perceived as moving in the opposite direction. In experiment 1, the magnitude of the auditory motion aftereffect for 500‐Hz tones was measured as a function of adapter velocity using the method of constant stimuli. Apparent sound move...
The ability of listeners to judge the number of concurrent talkers was examined. Ten female and 11 male Japanese talkers each recorded 20 familiar Japanese words consisting of four consonant–vowel syllables each. In each trial, a number of different talkers was chosen randomly from the same‐sex group, and presented synchronously to four native Japa...
The performances of several auditory front ends were evaluated in a phoneme recognition task using a VQ-HMM or an LVQ2 back end, and in a word recognition task using a DTW back end. The auditory front ends used in the experiments were different combinations of a fixed-Q cochlear filter (FQF), an adaptive-Q circuit (AQC) [T. Hirahara et al., Proc. I...
In this report, three front ends, a fixed Q cochlear filter (FQF), an adaptive Q cochlear filter (AQF), and a Bark DFT (DFT), are compared for use as the front end of a DTW system. The FQF is a conventional cascade/parallel‐type cochlear filter that simulates the asymmetrical filtering characteristics of a basilar membrane system. The AQF is a nonli...
Loudness comparisons were performed by four subjects, under two experimental conditions: free field (anechoic room) and diffuse field (reverberation room). Each subject adjusted the headphone level of critical band noise bursts until they were equally loud as those from a reference loudspeaker (70 dB SPL). Measurement scatter was smaller in the diff...
In this paper, several speech sounds are examined by masking methods to show typical examples of speech spectrum representation in the auditory pathway represented by a spatio-temporal masking pattern and to clarify differences between internal and physical representation of speech spectrum. Three types of Japanese speech, monosyllables, a sentence...
Recurrent neural networks with arbitrary feedback connections are highly nonlinear dynamical systems exhibiting variegated complex dynamical behavior. The applications of this temporal behavior hold possibilities for information processing. Supervised learning for recurrent networks is studied with emphasis on learning aperiodic motions. APOLONN (a...
In order to find the appropriate headphone to use in psychophysical experiments, the frequency responses of 12 headphones were measured by three physical methods: on an IEC coupler (B&K 4134), on a C coupler attached to a head and torso simulator (Kohken SAMRAI) (Okabe et al., J. Acoust. Soc. Jpn. (E) 5, 95–104), and by using a probe microphone in...
A computational nonlinear cochlear filter model with adaptive Q circuits is described. The model is built by introducing adaptive Q circuits into the linear cascade/parallel cochlear filter bank. The adaptive Q circuit is composed of two parts: a second-order low-pass function (LPF) and a Q decision circuit which calculates the LPF's Q in every tim...
Vowel identification tests were carried out using 200 synthesized vowel‐like stimuli to examine the role of the fundamental frequency F0 in vowel perception. These stimuli were synthetic versions of the five Japanese vowels, /i/, /e/, /a/, /o/, and /u/, of which the F0 and/or the formant frequencies Fi (i = 1,2,3,4) were modified: ten F0 values wer...
Basic characteristics of hearing elicited by applying amplitude-modulated electrical stimulation to the external ear were investigated by means of a psychophysical technique. The site in which sound was generated was discussed based on sound lateralization characteristics produced by applying electrical and sound stimulations to both ears. As the r...
A Japanese text-to-speech conversion technology has been developed, where a new text analysis method and a new speech synthesis method have been employed to improve pronunciation of Kanji characters and phoneme articulation. Morpheme analysis is first performed for an input text, and pronunciation and grammatical information are extracted. For word...
Acoustic characteristics of Non-Audible Murmur (NAM) are clarified. NAM is a very weak whispered voice which can be detected by a NAM sensor attached to the surface of the skin close behind the ear. NAM is inaudible body-conducted sound whereas normal voice is audible air-conducted sound. NAM signals are recorded from 6 male and 7 female speakers u...
A multi-degree of freedom (multi-DOF) ultrasonic motor is anticipated as an actuator for the next generation because it can drive three-DOF motion with high torque and low noise. Before a multi-DOF ultrasonic motor can be used as a servomotor in actual robots, we have to develop a proper pre-loading mechanism and control algorithm for quick multi-D...