Mathieu Fontaine

Institut Mines-Télécom | telecom-sudparis.eu · Laboratoire des Sciences et Technologies de l’Information (LSTI)

Doctor of Philosophy


Publications: 30
Reads: 0
Citations: 152
Introduction
audio source separation, denoising, audio event detection, augmented reality


Publications (30)
Preprint
Full-text available
This paper describes speech enhancement for realtime automatic speech recognition (ASR) in real environments. A standard approach to this task is to use neural beamforming that can work efficiently in an online manner. It estimates the masks of clean dry speech from a noisy echoic mixture spectrogram with a deep neural network (DNN) and then comput...
Preprint
Full-text available
Neural fields have successfully been used in many research fields for their native ability to estimate a continuous function from a finite number of observations. In audio processing, this technique has been applied to acoustic and head-related transfer function interpolation. However, most of the existing methods estimate the real-valued magnitude...
Preprint
This paper describes a practical dual-process speech enhancement system that adapts environment-sensitive frame-online beamforming (front-end) with help from environment-free block-online source separation (back-end). To use minimum variance distortionless response (MVDR) beamforming, one may train a deep neural network (DNN) that estimates time-fr...
Preprint
This paper describes the practical response- and performance-aware development of online speech enhancement for an augmented reality (AR) headset that helps a user understand conversations made in real noisy echoic environments (e.g., cocktail party). One may use a state-of-the-art blind source separation method called fast multichannel nonnegative...
Preprint
This paper describes noisy speech recognition for an augmented reality headset that helps verbal communication within real multiparty conversational environments. A major approach that has actively been studied in simulated environments is to sequentially perform speech enhancement and automatic speech recognition (ASR) based on deep neural network...
Preprint
This paper describes heavy-tailed extensions of a state-of-the-art versatile blind source separation method called fast multichannel nonnegative matrix factorization (FastMNMF) from a unified point of view. The common way of deriving such an extension is to replace the multivariate complex Gaussian distribution in the likelihood function with its h...
Article
This paper describes heavy-tailed extensions of a state-of-the-art versatile blind source separation method called fast multichannel nonnegative matrix factorization (FastMNMF) from a unified point of view. The common way of deriving such an extension is to replace the multivariate complex Gaussian distribution in the likelihood function with its h...
Article
Full-text available
This article describes a computationally-efficient statistical approach to joint (semi-)blind source separation and dereverberation for multichannel noisy reverberant mixture signals. A standard approach to source separation is to formulate a generative model of a multichannel mixture spectrogram that consists of source and spatial models represent...
Article
Full-text available
This paper describes a neural blind source separation (BSS) method based on amortized variational inference (AVI) of a non-linear generative model of mixture signals. A classical statistical approach to BSS is to fit a linear generative model that consists of spatial and source models representing the inter-channel covariances and power spectral den...
Article
Full-text available
This paper describes multichannel speech enhancement based on a probabilistic model of complex source spectrograms for improving the intelligibility of speech corrupted by undesired noise. The univariate complex Gaussian model with the reproductive property supports the additivity of source complex spectrograms and forms the theoretical basis of n...
Article
Full-text available
This paper describes a time-varying extension of independent vector analysis (IVA) based on the normalizing flow (NF), called NF-IVA, for determined blind source separation of multichannel audio signals. As in IVA, NF-IVA estimates demixing matrices that transform mixture spectra to source spectra in the complex-valued spatial domain such that the...
Article
Full-text available
Source separation aims at decomposing a vector into additive components. This is often done by first estimating source parameters before feeding them into a filtering method, often based on ratios of covariances. The whole pipeline is traditionally rooted in some probabilistic framework providing both the likelihood for parameter estimation and the...
Conference Paper
Full-text available
This paper introduces a new method for multichannel speech enhancement based on a versatile modeling of the residual noise spectrogram. Such a model has already been presented before in the single channel case where the noise component is assumed to follow an alpha-stable distribution for each time-frequency bin, whereas the speech spectrogram, su...
Chapter
Full-text available
This paper introduces a new method for multichannel speech enhancement based on a versatile modeling of the residual noise spectrogram. Such a model has already been presented before in the single channel case where the noise component is assumed to follow an alpha-stable distribution for each time-frequency bin, whereas the speech spectrogram, sup...
Conference Paper
Full-text available
Source separation, which consists in decomposing data into meaningful structured components, is an active research topic in music signal processing. In this paper, we introduce the Positive α-stable (PαS) distributions to model the latent sources, which are a sub-class of the stable distributions family. They notably permit us to model random varia...
Conference Paper
Full-text available
In this paper, we focus on the problem of sound source localization and we propose a technique that exploits the known and arbitrary geometry of the microphone array. While most probabilistic techniques presented in the past rely on Gaussian models, we go further in this direction and detail a method for source localization that is based on the rec...
Conference Paper
Full-text available
We propose a probabilistic model for acoustic source localization with known but arbitrary geometry of the microphone array. The approach has several features. First, it relies on a simple nearfield acoustic model for wave propagation. Second, it does not require the number of active sources. On the contrary, it produces a heat map representing the...
