J. Even

Nara Institute of Science and Technology, Ikoma, Nara, Japan

Publications (41) · 31.92 total impact

  • Dataset: Visibility
  • ABSTRACT: This work proposes a model of human habituation while riding a robotic wheelchair. We present and describe the concept of human navigational habituation, which we define as the human habituation to repeatedly riding a robotic wheelchair. The approach models habituation in terms of preferred linear velocity based on the experience of riding a wheelchair. We argue that the preferred velocity changes as the human gets used to riding the wheelchair. Inexperienced users initially prefer to ride at a slow, moderate pace; however, the longer they ride, the more they prefer to speed up to a certain comfort level, finding the initial slower velocities tediously "too slow" for their experience level. The proposed habituation model provides the passenger's preferred velocity based on experience. Biological measurements (galvanic skin conductance) and participant feedback demonstrate the preference for habituation velocity control over fixed velocity control. To our knowledge, habituation modeling is new in the field of autonomous navigation and robotics.
    IEEE/RSJ International Conference on Intelligent Robots and Systems, Chicago, Illinois; 09/2014
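    A minimal sketch of such a habituation model, assuming a saturating exponential velocity profile; the parameter names and values (v_initial, v_comfort, tau) are illustrative, not taken from the paper:

    ```python
    import math

    def preferred_velocity(experience_s: float,
                           v_initial: float = 0.6,   # m/s, assumed initial preference
                           v_comfort: float = 1.2,   # m/s, assumed habituated preference
                           tau: float = 600.0) -> float:  # s, assumed habituation time constant
        """Preferred linear velocity as a saturating function of riding experience.

        Hypothetical model: inexperienced riders start near v_initial and
        approach v_comfort as accumulated riding time grows.
        """
        return v_comfort - (v_comfort - v_initial) * math.exp(-experience_s / tau)

    # Example: preference after 0, 10 and 30 minutes of accumulated riding
    for t in (0.0, 600.0, 1800.0):
        print(f"{t / 60:4.0f} min -> {preferred_velocity(t):.2f} m/s")
    ```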
  • ABSTRACT: This paper presents a framework for creating a 3D map of an environment that contains the probability that a geometric feature emits sound. The goal is to provide an automated tool for condition monitoring of plants. The map is created by a mobile platform equipped with a microphone array and laser range sensors. The microphone array is used to estimate the sound power received from different directions, whereas the laser range sensors are used to estimate the platform's pose in the environment. During navigation, a ray casting method projects the audio measurements made on board the mobile platform onto the map of the environment. Experimental results show that the created map is an efficient tool for sound source localization.
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2014), Chicago, IL; 09/2014
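    A minimal sketch of the ray-casting projection step, assuming a 2D occupancy grid and a per-direction power estimate; the grid layout, names, and update rule are illustrative, not from the paper:

    ```python
    import math
    import numpy as np

    def project_audio_scan(power_map, hit_count, occupancy, pose, srp_power,
                           resolution=0.05, max_range=10.0):
        """Cast one ray per SRP direction and credit the received power
        to the first occupied cell the ray hits (hypothetical update rule)."""
        x0, y0, theta = pose                      # platform pose in map frame
        for bearing, power in srp_power.items():  # direction (rad) -> received power
            angle = theta + bearing
            for r in np.arange(resolution, max_range, resolution):
                i = int((x0 + r * math.cos(angle)) / resolution)
                j = int((y0 + r * math.sin(angle)) / resolution)
                if not (0 <= i < occupancy.shape[0] and 0 <= j < occupancy.shape[1]):
                    break
                if occupancy[i, j]:               # ray hit a geometric feature
                    power_map[i, j] += power      # accumulate received power
                    hit_count[i, j] += 1          # for later averaging
                    break
    ```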
  • ABSTRACT: This paper presents a framework for making a mobile robot aware of an entity in the blind region of its laser range finders when that entity emits sound. First, in a mapping stage, a 3D description of the environment that contains information about acoustic reflection is created. Then, during operation, the robot combines estimated directions of arrival of sound with this 3D description to detect entities that are not visible to line-of-sight sensors but can be heard because of sound reflections. Using this approach, it is possible to restrict the hypotheses about the position of a sound-emitting entity in the blind region to a small set of candidate depth values.
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2014), Chicago; 09/2014
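    One way to read the reflection step: a ray cast along the estimated DOA that hits a reflective surface is mirrored about the surface normal, and the hidden entity is hypothesized to lie along the reflected segment. A minimal sketch of the specular mirroring (the single-bounce assumption and names are illustrative):

    ```python
    import numpy as np

    def reflect(direction, normal):
        """Mirror a unit direction vector about a surface normal (specular reflection)."""
        direction = direction / np.linalg.norm(direction)
        normal = normal / np.linalg.norm(normal)
        return direction - 2.0 * np.dot(direction, normal) * normal

    # Example: sound arriving via a wall whose normal faces the robot
    doa = np.array([1.0, 1.0, 0.0])      # estimated direction of arrival
    wall_normal = np.array([-1.0, 0.0, 0.0])
    print(reflect(doa, wall_normal))     # direction toward the hidden entity
    ```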
  • ABSTRACT: This work introduces a 3D visibility model for comfortable autonomous vehicles. The model computes a visibility index based on the pose of the wheelchair within the environment. We correlate this index with human navigational comfort (discomfort) and discuss the importance of modeling visibility to improve human riding comfort. The proposed approach models the 3D visual field of view combined with a two-layered environmental representation. The field of view is modeled with information from the pose of the robot, a 3D laser sensor, and a two-layered environmental representation composed of a 3D geometric map with traversable area information. Human navigational discomfort was extracted from participants riding the autonomous wheelchair. Results show that there is a fair correlation between poor-visibility locations (e.g., blind corners) and human discomfort. The approach can model places with identical traversable characteristics but different visibility, and it differentiates visibility characteristics according to traveling direction.
    In Proc. of the IEEE International Conference on Robotics and Automation (ICRA 2014); 05/2014
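    A minimal sketch of a pose-dependent visibility index, assuming the index is the open fraction of the field of view before the first obstacle; the FOV geometry, ray count, and the callable occupancy test are illustrative, not the paper's model:

    ```python
    import math

    def visibility_index(pose, occupied, fov_rad=math.radians(110),
                         n_rays=64, max_range=8.0, step=0.1):
        """Hypothetical index in [0, 1]: 1.0 means every ray in the FOV reaches
        max_range; lower values indicate nearby occlusions (e.g., blind corners).

        occupied(x, y) is a caller-supplied predicate over the geometric map.
        """
        x0, y0, heading = pose
        total = 0.0
        for k in range(n_rays):
            angle = heading - fov_rad / 2 + k * fov_rad / (n_rays - 1)
            r = step
            while r < max_range and not occupied(x0 + r * math.cos(angle),
                                                 y0 + r * math.sin(angle)):
                r += step
            total += r / max_range
        return total / n_rays
    ```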
  • ABSTRACT: This paper presents a method for detecting moving entities that are in the robot's path but not in the field of view of sensors like laser scanners, cameras, or ultrasonic sensors. The proposed system makes use of passive acoustic localization methods, which receive information from occluded regions (at intersections or corners) because of the multipath nature of sound propagation. Unlike conventional sensors, this method does not require line of sight. In particular, specular reflections in the environment make it possible to detect moving entities that emit sound, such as a walking person or a rolling cart. This idea was exploited for safe navigation of a mobile platform at intersections. The passive acoustic localization output is combined with a 3D geometric map of the environment that is precise enough to estimate sound propagation and reflection using ray casting methods. This gives the robot the ability to detect a moving entity outside the field of view of sensors that require line of sight. The robot can then recalculate its path and wait until the detected entity is out of its path so that it is safe to move to its destination. To illustrate the performance of the proposed method, a comparison of the robot's navigation with and without audio sensing is provided for several intersection scenarios.
    Intelligent Robots and Systems (IROS), 2013 IEEE/RSJ International Conference on; 01/2013
  • ABSTRACT: This paper presents a method for mapping the radiated sound intensity of an environment using an autonomous mobile platform. The sound intensities radiated by the objects are estimated by combining the sound intensity at the platform's position (estimated with a steered response power algorithm) and the distances to the objects (estimated using laser range finders). By combining the estimated sound intensity at the platform's position with the platform's pose obtained from a particle-filter-based localization algorithm, the sound intensity radiated from the objects is registered in the cells of a grid map covering the environment. This procedure creates a map of the radiated sound intensity that contains information about the sound directivity. To illustrate the effectiveness of the proposed method, a map of radiated sound intensity is created for a test environment. Then the positions and the directivity of the sound sources in the test environment are estimated from this map.
    Intelligent Robots and Systems (IROS), 2013 IEEE/RSJ International Conference on; 01/2013
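    A minimal sketch of the registration step, assuming a free-field point-source inverse-square model so that the power radiated by an object can be recovered from the intensity received at the platform and the laser-measured distance; the model choice is an assumption, not stated in the abstract:

    ```python
    import math

    def radiated_power(received_intensity: float, distance_m: float) -> float:
        """Invert the point-source model I(r) = P / (4 * pi * r^2) to estimate
        the power P radiated by the object hit by the laser ray."""
        return received_intensity * 4.0 * math.pi * distance_m ** 2

    # Example: intensity of 1e-6 W/m^2 received from an object 3 m away
    print(radiated_power(1e-6, 3.0))   # ~1.13e-4 W
    ```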
  • C.T. Ishi, J. Even, N. Hagita
    ABSTRACT: We propose a method for estimating sound source locations in 3D space by integrating sound directions estimated by multiple microphone arrays and taking advantage of reflection information. Two types of sources with different directivity properties (human speech and loudspeaker speech) were evaluated at different positions and orientations. Experimental results showed the effectiveness of using reflection information, depending on the position and orientation of the sound sources relative to the arrays and walls, and on the source type. The use of reflection information increased the source position detection rates by 10% on average and by up to 60% in the best case.
    Intelligent Robots and Systems (IROS), 2013 IEEE/RSJ International Conference on; 01/2013
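    Integrating DOAs from multiple arrays amounts to intersecting 3D bearing lines; a standard least-squares intersection follows as a sketch of the geometric core only, not the paper's full method with reflections:

    ```python
    import numpy as np

    def triangulate(origins, directions):
        """Least-squares point closest to a set of 3D bearing lines.

        origins: (N, 3) microphone-array positions.
        directions: (N, 3) DOA vectors from each array.
        Solves sum_i (I - d_i d_i^T)(x - o_i) = 0 for x.
        """
        A = np.zeros((3, 3))
        b = np.zeros(3)
        for o, d in zip(origins, directions):
            d = d / np.linalg.norm(d)
            P = np.eye(3) - np.outer(d, d)   # projector orthogonal to the line
            A += P
            b += P @ o
        return np.linalg.solve(A, b)

    # Example: two arrays observing a source near (1, 2, 1.5)
    origins = np.array([[0.0, 0.0, 1.0], [4.0, 0.0, 1.0]])
    src = np.array([1.0, 2.0, 1.5])
    print(triangulate(origins, src - origins))   # ~[1. 2. 1.5]
    ```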
  • ABSTRACT: This paper presents a multi-modal sensor approach for mapping sound sources using an omni-directional microphone array on an autonomous mobile robot. A fusion of audio data (from the microphone array), odometry information, and laser range scan data (from the robot) is used to precisely localize and map the audio sources in an environment. An audio map is created while the robot autonomously navigates through the environment by continuously generating audio scans with a steered response power (SRP) algorithm. Using the poses of the robot, rays are cast in the map in all directions given by the SRP. Each occupied cell in the geometric map hit by a ray is then assigned a likelihood of containing a sound source, derived from the SRP at that particular instant. Since the localization of the robot is probabilistic, the uncertainty in the pose of the robot in the geometric map is propagated to the occupied cells hit during the ray casting. This process is repeated while the robot is in motion, and the map is updated after every audio scan. The generated sound maps can be reused, with the robot updating the map as it identifies changes in the audio environment.
    Robotics and Automation (ICRA), 2013 IEEE International Conference on; 01/2013
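    Building on the hypothetical project_audio_scan sketch two entries above, one plausible reading of the uncertainty propagation step is to replay the same SRP scan from every weighted pose hypothesis of the particle filter; the weighting scheme is illustrative, not the paper's:

    ```python
    def update_audio_map(power_map, hit_count, occupancy, particles, srp_power):
        """Hypothetical uncertainty propagation: cast the same SRP scan from
        each weighted pose hypothesis, scaling each contribution by the
        particle weight so pose ambiguity smears power over plausible cells."""
        for pose, weight in particles:            # ((x, y, theta), normalized weight)
            weighted = {b: weight * p for b, p in srp_power.items()}
            project_audio_scan(power_map, hit_count, occupancy, pose, weighted)
    ```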
  • ABSTRACT: This paper focuses on the problem of environmental noise in human-human communication and in automatic speech recognition. To deal with this problem, the use of alternative acoustic sensors, which are attached to the talker and receive the uttered speech through skin or bone, is investigated. In the current study, throat microphones and ear bone microphones are integrated with standard microphones using several fusion methods. The results show that recognition rates in noisy environments increase drastically when these sensors are integrated with standard microphones. Moreover, the system does not show any recognition degradation in clean environments; in fact, recognition rates also increase slightly there. Using late fusion to integrate a throat microphone, an ear bone microphone, and a standard microphone, we achieved a 44% relative improvement in recognition rate in a noisy environment and a 24% relative improvement in a clean environment.
    Acoustics, Speech and Signal Processing (ICASSP), 2012 IEEE International Conference on; 01/2012
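    A minimal sketch of the late-fusion idea, assuming each sensor's recognizer returns per-hypothesis log-likelihood scores that are combined with fixed weights; the weights and the score combination are illustrative:

    ```python
    def late_fusion(hypothesis_scores, weights):
        """Combine per-sensor recognizer scores and pick the best hypothesis.

        hypothesis_scores: {sensor: {hypothesis: log_likelihood}}
        weights: {sensor: weight}, e.g. favoring the standard microphone in
        clean conditions and the throat/ear-bone sensors in noise.
        """
        combined = {}
        for sensor, scores in hypothesis_scores.items():
            for hyp, ll in scores.items():
                combined[hyp] = combined.get(hyp, 0.0) + weights[sensor] * ll
        return max(combined, key=combined.get)

    # Example with three sensors and two competing hypotheses
    scores = {"standard": {"hello": -10.0, "fellow": -12.0},
              "throat":   {"hello": -11.0, "fellow": -15.0},
              "ear_bone": {"hello": -9.5,  "fellow": -14.0}}
    print(late_fusion(scores, {"standard": 0.5, "throat": 0.25, "ear_bone": 0.25}))
    ```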
  • ABSTRACT: This paper presents an audio monitoring system for detecting and identifying people engaged in a conversation. The proposed method is hands-free, as it uses a microphone array to acquire the sound. A particularity of the approach is the use of a human tracker based on laser range finders. The human tracker monitors the locations of people; local steered response power is then used to detect the people who are speaking and to precisely localize their mouths. An audio stream is then created for each person and used to perform speaker identification. Experimental results show that the use of the human tracker has several benefits compared to an audio-only approach.
    Intelligent Robots and Systems (IROS), 2012 IEEE/RSJ International Conference on; 01/2012
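    A minimal sketch of restricting the steered response power search to the neighborhood of a tracker-provided position, using plain delay-and-sum; the array geometry, sample rate, and search grid are illustrative, and the paper's SRP variant may differ:

    ```python
    import numpy as np

    C = 343.0  # speed of sound, m/s

    def srp(frames, mic_pos, point, fs):
        """Delay-and-sum power of multichannel frames steered at one 3D point."""
        delays = np.linalg.norm(mic_pos - point, axis=1) / C        # per-mic delay (s)
        shifts = np.round((delays - delays.min()) * fs).astype(int)
        n = frames.shape[1] - shifts.max()
        aligned = sum(frames[m, s:s + n] for m, s in enumerate(shifts))
        return float(np.mean(aligned ** 2))

    def localize_mouth(frames, mic_pos, tracked_pos, fs, radius=0.3, step=0.1):
        """Search a small cube around the tracked head position instead of the
        whole room; this is the benefit of coupling SRP with the human tracker."""
        offsets = np.arange(-radius, radius + 1e-9, step)
        candidates = [tracked_pos + np.array([dx, dy, dz])
                      for dx in offsets for dy in offsets for dz in offsets]
        return max(candidates, key=lambda p: srp(frames, mic_pos, p, fs))
    ```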
  • ABSTRACT: Parkinson's disease (PD) is a severe disease with many symptoms, including speech disorders. Although many methods exist to treat some of PD's symptoms, therapies for speech impairment are not effective and satisfactory, leaving an open area of research. The current project aims at taking advantage of the Lombard reflex to improve the speech loudness of PD patients. As a first step, the experience of the Lombard reflex by Japanese PD patients was confirmed, and the perception of PD patients' speech was evaluated by several subjects. In a following step, methods based on masking sound will be used for intensive training and for self-training of PD patients; after intensive training, PD patients may be able to talk louder even without masking noise. In addition, the design and development of a masking-sound device that PD patients can use while on the phone is under consideration.
    Bioinformatics & Bioengineering (BIBE), 2012 IEEE 12th International Conference on; 01/2012
  • J. Even, N. Hagita
    ABSTRACT: This paper presents a novel method for solving the permutation problem inherent to frequency-domain blind signal separation of multiple simultaneous speakers. Like conventional methods, the proposed method exploits the directions of arrival (DOAs) of the different speakers to resolve the permutation, but it is designed to exploit information from pairs of microphones that are usually discarded because of spatial aliasing. The proposed method is based on an explicit expression of the effect of spatial aliasing on the DOA estimation. By introducing a vector of integer values into the equation used to estimate the DOA, it becomes possible to compensate for the spatial aliasing by solving the equation with respect to that vector. The proposed method operates sequentially along the frequency bins. First, the spatial aliasing is compensated by an iterative procedure that also detects the permutations. Then the detected permutations are resolved and the DOAs are estimated using all available pairs of microphones. Simulation results demonstrate the effectiveness of the method.
    Acoustics, Speech and Signal Processing (ICASSP), 2011 IEEE International Conference on; 06/2011
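    The aliasing ambiguity the method exploits can be written explicitly; in my notation (not the paper's), the observed phase difference of a microphone pair determines the inter-channel delay only up to an integer number of periods:

    ```latex
    % For microphone pair (p, q) with spacing d_{pq}, the phase difference
    % \Delta\phi_{pq}(f) is only known modulo 2\pi, hence the integer n_{pq}(f):
    \tau_{pq}(f) = \frac{\Delta\phi_{pq}(f) + 2\pi\, n_{pq}(f)}{2\pi f},
    \qquad
    \theta = \arcsin\!\left(\frac{c\,\tau_{pq}(f)}{d_{pq}}\right).
    % Spatial aliasing occurs for f > c / (2 d_{pq}): several integers n_{pq}(f)
    % yield a valid angle, and the method estimates this integer vector jointly
    % with the permutations across frequency bins.
    ```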
  • INTERSPEECH 2011, 12th Annual Conference of the International Speech Communication Association, Florence, Italy, August 27-31, 2011; 01/2011
  • ABSTRACT: Small informal meetings of two to four participants are very common in work environments. For this reason, a convenient way of recording and archiving these meetings is of great interest. To efficiently archive such meetings, an important task is to keep track of "who talked when" during a meeting. This paper proposes a new multi-modal approach to this speaker activity detection problem. One novelty of the proposed approach is that it uses a human tracker that relies on scanning laser range finders (LRFs) to localize the participants. This choice is especially relevant for robotic applications, as robots are often equipped with LRFs for navigation purposes. In the proposed system, a tabletop microphone array in the center of the meeting room acquires the audio data while the LRF-based human tracker monitors the movements of the participants. Speaker activity detection is then performed using Gaussian mixture models trained beforehand. An experiment reproducing a meeting configuration demonstrates the performance of the system for speaker activity detection. In particular, the proposed hands-free system maintains a good level of performance compared to close-talking microphones while participants are speaking simultaneously.
    2011 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2011, San Francisco, CA, USA, September 25-30, 2011; 01/2011
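    A minimal sketch of a GMM-based activity decision, assuming one model trained on "speaking" frames and one on "silent" frames of per-participant features; the features and training data are stand-ins, and scikit-learn is used for brevity:

    ```python
    import numpy as np
    from sklearn.mixture import GaussianMixture

    # Stand-in training features (real ones would come from the array,
    # e.g., steered power toward one tracked participant).
    rng = np.random.default_rng(0)
    speech_frames = rng.normal(1.0, 0.5, size=(500, 3))
    silence_frames = rng.normal(0.0, 0.5, size=(500, 3))

    # One GMM per class, trained beforehand as in the paper.
    speech_gmm = GaussianMixture(n_components=4, random_state=0).fit(speech_frames)
    silence_gmm = GaussianMixture(n_components=4, random_state=0).fit(silence_frames)

    def is_speaking(frame: np.ndarray) -> bool:
        """Frame-level decision by comparing per-class log-likelihoods."""
        x = frame.reshape(1, -1)
        return speech_gmm.score_samples(x)[0] > silence_gmm.score_samples(x)[0]

    print(is_speaking(np.array([1.2, 0.9, 1.1])))   # likely True
    ```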
  • ABSTRACT: In this paper, we propose a microphone array structure for a speech-oriented robot dialog system that is designed to discriminate between the direction of arrival (DOA) of the target speech and that of the robot's internal noise. First, we investigate the performance of the noise estimation conducted by semi-blind source separation (SBSS) in the presence of both diffuse background noise and robot internal noise. The results indicate that the SBSS noise estimate is poor. Next, we analyze the DOA of the robot's internal noise to determine the reason for this result; we find that the internal noise always arrives in phase at the microphone array and spatially overlaps with the target speech. Based on this fact, we propose changing the microphone array structure from a broadside array to an end-fire array in order to discriminate the DOAs of the target speech and the internal noise. Finally, we evaluate the word accuracy in a dictation task in the presence of both diffuse background noise and robot internal noise to confirm the advantage of the proposed structure. Simulation results show that the proposed microphone array structure yields approximately a 10% improvement in speech recognition performance.
    Intelligent Robots and Systems (IROS), 2010 IEEE/RSJ International Conference on; 11/2010
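    The geometric intuition admits a one-line statement; in my notation, for a linear array with spacing d and a far-field source at angle theta from the array axis:

    ```latex
    % Inter-microphone delay for a far-field source at angle \theta from the axis:
    \tau(\theta) = \frac{d \cos\theta}{c}.
    % Broadside array: the target at \theta = 90^\circ gives \tau = 0, the same
    % zero delay as internal noise that is in phase at both microphones.
    % End-fire array: the target at \theta = 0^\circ gives \tau = d / c, the
    % largest possible delay, so target speech and in-phase internal noise
    % become separable in DOA.
    ```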
  • ABSTRACT: Several recent methods for speech enhancement in the presence of diffuse background noise use frequency-domain blind signal separation to estimate the diffuse noise and a nonlinear post-filter to suppress this estimated noise. This paper presents a frequency-domain blind signal extraction method for estimating the diffuse noise in place of frequency-domain blind signal separation. The method is based on minimizing, by means of a complex Newton algorithm, a cost function that depends on the modulus of the extracted component. The proposed complex Newton method is compared to gradient descent on the same cost function and to the blind signal separation approach.
    Acoustics Speech and Signal Processing (ICASSP), 2010 IEEE International Conference on; 04/2010
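    The abstract does not state the cost function; purely as an illustration of this family of methods, a typical modulus-based extraction contrast and a Newton-type update might look like:

    ```latex
    % Extract one component y(f) = w^{H} x(f) per frequency bin by minimizing
    % a contrast built on the modulus of the output, e.g.
    J(w) = \mathbb{E}\!\left[ G\!\left( |w^{H} x|^{2} \right) \right]
    \quad \text{s.t.} \quad \|w\| = 1,
    % with G a smooth nonlinearity. A complex Newton step uses the gradient
    % and the complex Hessian of J:
    w \leftarrow w - H^{-1}(w)\, \nabla_{w^{*}} J(w),
    \qquad w \leftarrow w / \|w\|.
    % The paper compares such a Newton scheme against plain gradient descent
    % on the same cost and against full blind signal separation.
    ```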
  • ABSTRACT: This paper studies the blind estimation of diffuse background noise for hands-free speech interfaces. Recent papers showed that it is possible to use blind signal separation (BSS) to estimate the diffuse background noise by suppressing the speech component after all the components have been separated. In particular, the scale indeterminacy of BSS is avoided by using the projection back method. In this paper, we study an alternative to projection back for the noise estimation and justify the use of blind signal extraction (BSE) rather than BSS.
    Acoustics Speech and Signal Processing (ICASSP), 2010 IEEE International Conference on; 04/2010
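    Projection back is a standard construction and worth stating; in the usual formulation (not specific to this paper), it resolves the scale indeterminacy by re-imaging each separated component at the microphones:

    ```latex
    % y(f) = W(f) x(f) separates the mixture; each y_k(f) is known only up
    % to an arbitrary scale. Projection back re-images component k at the
    % microphones:
    \hat{x}^{(k)}(f) = W^{-1}(f)\, E_{k}\, y(f),
    % where E_k is the selection matrix that zeroes all components except
    % the k-th. The noise estimate follows by projecting back all non-speech
    % components, with no arbitrary scaling left.
    ```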
  • INTERSPEECH 2010, 11th Annual Conference of the International Speech Communication Association, Makuhari, Chiba, Japan, September 26-30, 2010; 01/2010
  • ABSTRACT: The speech enhancement architecture presented in this paper is specifically developed for hands-free robot spoken dialog systems. It is designed to take advantage of additional sensors installed inside the robot to record the internal noises. First, a modified frequency-domain blind signal separation (FD-BSS) gives estimates of the noises generated outside and inside the robot. Then these noises are canceled from the acquired speech by a multichannel Wiener post-filter. Experimental results show the recognition improvement for a dictation task in the presence of both diffuse background noise and internal noises.
    Intelligent Robots and Systems, 2009. IROS 2009. IEEE/RSJ International Conference on; 11/2009
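    The post-filtering step admits a compact textbook statement; the paper's multichannel variant may differ, but per time-frequency bin a Wiener gain takes the form:

    ```latex
    % Single-channel form per time-frequency bin (f, t):
    G(f,t) = \frac{\hat{\Phi}_{ss}(f,t)}{\hat{\Phi}_{ss}(f,t) + \hat{\Phi}_{nn}(f,t)},
    \qquad
    \hat{s}(f,t) = G(f,t)\, x(f,t),
    % where \hat{\Phi}_{nn} aggregates the FD-BSS estimates of the external
    % (diffuse) and internal robot noises.
    ```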