Shoichi Koyama

The University of Tokyo | Todai · Department of Information Physics and Computing

PhD

About

107
Publications
7,861
Reads
829
Citations
Citations since 2017
63 Research Items
707 Citations
Introduction
Shoichi Koyama received the B.E., M.S., and Ph.D. degrees from the University of Tokyo, Tokyo, Japan, in 2007, 2009, and 2014, respectively. He joined Nippon Telegraph and Telephone Corporation (NTT) in 2009 and began his career as a researcher in acoustic signal processing at NTT Cyberspace Laboratories (currently Media Intelligence Laboratories). In 2014, he joined the Graduate School of Information Science and Technology, the University of Tokyo, where he is now an Assistant Professor (Lecturer). From 2016 to 2018, he was also a visiting researcher at Paris Diderot University (Paris 7) / Institut Langevin, Paris, France. His research interests include sound field analysis and synthesis, sparse signal representation, and acoustic inverse problems.
Additional affiliations
April 2018 - present
The University of Tokyo
Position
  • Lecturer
April 2016 - March 2018
Université Paris Diderot / Institut Langevin
Position
  • Researcher
April 2014 - March 2018
The University of Tokyo
Position
  • Professor (Assistant)
Education
January 2014 - January 2014
April 2007 - March 2009
April 2003 - March 2007


Publications (107)
Article
For transmission of a physical sound field in a large area, it is necessary to transform received signals of a microphone array into driving signals of a loudspeaker array to reproduce the sound field. We propose a method for transforming these signals by using planar or linear arrays of microphones and loudspeakers. A continuous transform equation...
Article
A sparse representation method for multidimensional signals is proposed. In generally used group-sparse representation algorithms, the sparsity is imposed only on a single dimension and the signals in the other dimensions are solved in the least-square-error sense. However, multidimensional signals can be sparse in multiple dimensions. For example,...
Article
Full-text available
In order to control an acoustic field inside a target region, it is important to choose suitable positions of secondary sources (loudspeakers) and sensors (control points/microphones). This paper provides an overview of state-of-the-art source and sensor placement methods in sound field control. Although the placement of both sources and sensors gr...
Article
Full-text available
An active noise control (ANC) method to reduce noise over a region in space based on kernel interpolation of sound field is proposed. Current methods of spatial ANC are largely based on spherical or circular harmonic expansion of the sound field, where the geometry of the error microphone array is restricted to a simple one such as a sphere or circ...
Article
This paper investigates sound-field modeling in a realistic reverberant setting. Starting from a few point-like microphone measurements, the goal is to estimate the direct source field within a whole 3D space around these microphones. Previous sparse sound field decompositions assumed only a spatial sparsity of the source distribution, but could ge...
Preprint
A spatial active noise control (ANC) method based on the interpolation of a sound field from reference microphone signals is proposed. In most current spatial ANC methods, a sufficient number of error microphones are required to reduce noise over the target region because the sound field is estimated from error microphone signals. However, in pract...
Preprint
Full-text available
Two sound field reproduction methods, weighted pressure matching and weighted mode matching, are theoretically and experimentally compared. The weighted pressure and mode matching are a generalization of conventional pressure and mode matching, respectively. Both methods are derived by introducing a weighting matrix in the pressure and mode matchin...
Preprint
Full-text available
An interpolation method for region-to-region acoustic transfer functions (ATFs) based on kernel ridge regression with an adaptive kernel is proposed. Most current ATF interpolation methods do not incorporate the acoustic properties for which measurements are performed. Our proposed method is based on a separate adaptation of directional weighting f...
Preprint
Full-text available
A sound field reproduction method called weighted pressure matching is proposed. Sound field reproduction is aimed at synthesizing the desired sound field using multiple loudspeakers inside a target region. Optimization-based methods are derived from the minimization of errors between synthesized and desired sound fields, which enable the use of an...
Preprint
A multizone sound field control method, called amplitude matching, is proposed. The objective of amplitude matching is to synthesize a desired amplitude (or magnitude) distribution over a target region with multiple loudspeakers, whereas the phase distribution is arbitrary. Most of the current multizone sound field control methods are intended to...
Preprint
Full-text available
We propose a method of head-related transfer function (HRTF) interpolation from sparsely measured HRTFs using an autoencoder with source position conditioning. The proposed method is drawn from an analogy between an HRTF interpolation method based on regularized linear regression (RLR) and an autoencoder. Through this analogy, we found the key feat...
Preprint
Full-text available
A sound field estimation method based on a physics-informed convolutional neural network (PICNN) using spline interpolation is proposed. Most of the sound field estimation methods are based on wavefunction expansion, making the estimated function satisfy the Helmholtz equation. However, these methods rely only on physical properties; thus, they suf...
Preprint
Full-text available
A method of interpolating the acoustic transfer function (ATF) between regions that takes into account both the physical properties of the ATF and the directionality of region configurations is proposed. Most spatial ATF interpolation methods are limited to estimation in the region of receivers. A kernel method for region-to-region ATF interpolatio...
Preprint
Full-text available
A spatial active noise control (ANC) method based on the individual kernel interpolation of primary and secondary sound fields is proposed. Spatial ANC is aimed at cancelling unwanted primary noise within a continuous region by using multiple secondary sources and microphones. A method based on the kernel interpolation of a sound field makes it pos...
Preprint
Full-text available
A multizone sound field control method, called amplitude matching, is proposed. The objective of amplitude matching is to synthesize a desired amplitude (or magnitude) distribution over a target region with multiple loudspeakers, whereas the phase distribution is arbitrary. Most of the current multizone sound field control methods are intended to s...
Preprint
A multizone sound field control method, called amplitude matching, is proposed. The objective of amplitude matching is to synthesize a desired amplitude (or magnitude) distribution over a target region with multiple loudspeakers, whereas the phase distribution is arbitrary. Most of the current multizone sound field control methods are intended to s...
Article
Full-text available
Sensor placement methods for field estimation based on Gaussian processes are proposed. Generally, sensor placement methods determine the appropriate placement positions by selecting them from predefined candidate positions. Many criteria for the selection have been proposed, with which the quality of the placements is evaluated with regard to the...
Article
Full-text available
A method to interpolate the acoustic transfer function (ATF) between regions using kernel ridge regression (KRR) is proposed. Conventionally, the ATF interpolation problem is strongly restricted and situational, depending on knowledge of environmental conditions while not accounting for source position variation. We derive our interpolation functio...
Article
Full-text available
A multizone sound field control method, called amplitude matching, is proposed. The objective of amplitude matching is to synthesize a desired amplitude (or magnitude) distribution over a target region with multiple loudspeakers, whereas the phase distribution is arbitrary. Most of the current multizone sound field control methods are intended to s...
Preprint
Full-text available
A method of optimizing secondary source placement in sound field synthesis is proposed. Such an optimization method will be useful when the allowable placement region and available number of loudspeakers are limited. We formulate a mean-square-error-based cost function, incorporating the statistical properties of possible desired sound fields, for...
Preprint
Full-text available
Sound field reproduction methods based on numerical optimization, which aim to minimize the error between synthesized and desired sound fields, are useful in many practical scenarios because of their flexibility in the array geometry of loudspeakers. However, the reproduction performance of these methods in a practical environment has not been suff...
Preprint
Full-text available
A method to estimate an acoustic field from discrete microphone measurements is proposed. A kernel-interpolation-based method using the kernel function formulated for sound field interpolation has been used in various applications. The kernel function with directional weighting makes it possible to incorporate prior information on source directions...
Article
Full-text available
A method of binaural rendering from microphone array signals of arbitrary geometry is proposed. To reproduce binaural signals from microphone array recordings at a remote location, a spherical microphone array is generally used for capturing a soundfield. However, owing to the lack of flexibility in the microphone arrangement, the single spherical...
Preprint
Full-text available
A method of binaural rendering from microphone array signals of arbitrary geometry is proposed. To reproduce binaural signals from microphone array recordings at a remote location, a spherical microphone array is generally used for capturing a soundfield. However, owing to the lack of flexibility in the microphone arrangement, the single spherical...
Conference Paper
Sound field reproduction methods based on numerical optimization, which aim to minimize the error between synthesized and desired sound fields, are useful in many practical scenarios because of their flexibility in the array geometry of loudspeakers. However, the reproduction performance of these methods in a practical environment has not been suff...
Preprint
Full-text available
A new impulse response (IR) dataset called "MeshRIR" is introduced. Currently available datasets usually include IRs at an array of microphones from several source positions under various room conditions, which are basically designed for evaluating speech enhancement and distant speech recognition methods. On the other hand, methods of estimating o...
Article
Full-text available
A wave field estimation method exploiting prior information on source direction is proposed. First, we formulate a wave field estimation problem as regularized least squares, where the norm of the wave field is used for a regularization term. The norm of the wave field is defined on the basis of the weighting function that reflects the prior inform...
Article
We propose a useful formulation for ill-posed inverse problems in Hilbert spaces with nonlinear clipping effects. Ill-posed inverse problems are often formulated as optimization problems, and nonlinear clipping effects may cause nonconvexity or nondifferentiability of the objective functions in the case of commonly used regularized least squares. T...
Article
Full-text available
There is growing interest in new audio formats in the context of virtual reality (VR), and higher-order ambisonics (HOA) is preferred for VR systems to transmit recorded scenes owing to its transmission efficiency and its flexibility to work with different loudspeaker setups. However, the conversion between another well-known format, i.e., object f...
Conference Paper
Full-text available
A method of binaural rendering from distributed microphone recordings that takes loudspeaker distance for measuring head-related transfer function (HRTF) into consideration is proposed. In general, to reproduce the binaural signals from the signals captured by multiple microphones in the recording area, the captured sound field is represented by pl...
Article
Full-text available
Estimating and interpolating a sound field from measurements using multiple microphones are fundamental tasks in sound field analysis for sound field reconstruction. The sound field reconstruction inside a source-free region is achieved by decomposing the sound field into plane-wave or harmonic functions. When the target region includes sources, it...
Article
Full-text available
A sound field decomposition method based on the reciprocity gap functional (RGF) in the spherical harmonic domain is proposed. To estimate and reconstruct a continuous sound field including sources by using multiple microphones, an intuitive and powerful strategy is to decompose the sound field into Green’s functions. Sparse-representation algorith...
Article
Full-text available
Blind source separation exploiting multichannel information has long been a popular topic, and recently proposed methods based on the local Gaussian model have shown promising results despite their high computational cost for the case of many microphone signals. The low updating speed for such a model is mainly due to the inversion of a spatial covar...
Conference Paper
Full-text available
A sound field recording method using multiple circular microphone arrays considering the effect of multiple scattering is proposed. To avoid the numerical instability of an open microphone array, a rigid array, i.e., a microphone array mounted on a circular/spherical baffle, exploiting the scattering effect of a single baffle is frequently used fo...
Conference Paper
Full-text available
A method for feedback active noise control (ANC) over a three-dimensional (3D) spatial region is proposed. Conventional multipoint ANC does not guarantee noise reduction between multiple discrete control points. Several attempts have been made to reduce the noise over the continuous target region. Most methods for spatial ANC found their basis...
Article
Full-text available
A sound field reproduction method based on the spherical wavefunction expansion of sound fields is proposed, which can be flexibly applied to various array geometries and directivities. First, we formulate sound field synthesis as a minimization problem of some norm on the difference between the desired and synthesized sound fields, and then the op...
Conference Paper
A gridless sound field decomposition method based on the reciprocity gap functional (RGF) is proposed. An intuitive and powerful way of reconstructing a sound field inside a region including sound sources is to decompose the sound field into Green's functions. Current methods based on sparse representation require discretization of the reconstructi...
Article
Full-text available
A sound field recording and reproduction method based on sparse sound field decomposition is proposed. Most current methods are based on plane-wave or harmonic decomposition of the pressure distribution obtained by microphones, which leads to spatial aliasing artifacts with severe effects. This paper proposes a method for sound field decomposition...
Article
A sound field recording method based on spherical or circular harmonic analysis for arbitrary array geometry and directivity of microphones is proposed. In current methods based on harmonic analysis, a sound field is decomposed into harmonic functions with a center given in advance, which is called a global origin, and their coefficients are obtain...
Conference Paper
We address a novel nonnegative matrix factorization (NMF) with a new basis deformation method to handle various music sounds. Conventional supervised NMF has a critical problem that a mismatch between bases trained in advance and an actual target sound reduces the accuracy of separation. To solve this problem, we proposed an advanced supervised NMF...
Presentation
Sound field recording and reproduction enables us to construct more realistic audio systems. In practical systems, sound pressure is obtained with microphones in a recording area, and then the sound field is reproduced with loudspeakers in a target area. Therefore, a signal conversion algorithm for obtaining the driving signals of the loudspeakers...
Conference Paper
In this paper, we address the music signal separation problem and propose a new supervised nonnegative matrix factorization (SNMF) algorithm employing the deformation of a spectral supervision basis trained in advance. Conventional SNMF has a problem that the separation accuracy is degraded by a mismatch between the trained basis and the spectrogra...
Conference Paper
Source separation using an ad hoc microphone array can be useful for enhancing speech in such applications as teleconference systems without the need to prepare special devices. However, the positions of the sources (and the microphones when using an ad hoc microphone array) can change during recording, thus violating the commonly made assumption i...
Article
A sound field recording and reproduction method using circular arrays of microphones and loudspeakers with a spherical baffle is proposed. The spherical baffle is an acoustically rigid object on which the microphone array is mounted. The driving signals of the loudspeakers must be obtained from the signals received by the microphones. A transform f...
Conference Paper
Full-text available
A method for sparse sound field decomposition with parametric dictionary learning is proposed. Sound field decomposition forms the foundation of various acoustic signal processing applications. Our main focus is sound field recording and reproduction for high-fidelity audio systems. To improve the reproduction accuracy above the spatial Nyquist fre...
Conference Paper
Full-text available
A sparse sound field decomposition method is proposed. Sound field decomposition is the foundation of the various acoustic signal processing applications and enables the estimation of the entire sound field from pressure measurements. The plane wave decomposition, i.e., spatial Fourier analysis, of the sound field has been widely used; however, art...
Article
A sound field recording and reproduction method that exploits prior information on the locations of sound sources to be reproduced is proposed. Current methods do not take such prior information into consideration in the transformation from the signals received by microphones into the driving signals of loudspeakers. The proposed method for planar...
Article
For sound field reproduction that includes height (with-height reproduction), it is more efficient to record and reproduce the sound field with lower resolution in elevation than in azimuth due to the spatial abilities of human auditory perception. We propose a sound field reproduction method using horizontally arranged cylindrical arrays of microp...