ABSTRACT: We propose the audio inpainting framework that recovers portions of audio data distorted due to impairments such as impulsive noise, clipping, and packet loss. In this framework, the distorted data are treated as missing and their location is assumed to be known. The signal is decomposed into overlapping time-domain frames and the restoration problem is then formulated as an inverse problem per audio frame. Sparse representation modeling is employed per frame, and each inverse problem is solved using the Orthogonal Matching Pursuit algorithm together with a discrete cosine or a Gabor dictionary. The Signal-to-Noise Ratio performance of this algorithm is shown to be comparable to or better than that of state-of-the-art methods when blocks of samples of variable durations are missing. We also demonstrate that the size of the block of missing samples, rather than the overall number of missing samples, is a crucial parameter for high-quality signal restoration. We further introduce a constrained Matching Pursuit approach for the special case of audio declipping that exploits the sign pattern of clipped audio samples and their maximal absolute value, and that allows the user to specify the maximum amplitude of the signal. This approach is shown to outperform state-of-the-art and commercially available methods for audio declipping in terms of Signal-to-Noise Ratio.
IEEE Transactions on Audio Speech and Language Processing 04/2012; · 1.68 Impact Factor
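The per-frame restoration described in the audio inpainting abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`dct_dictionary`, `omp`, `inpaint_frame`), the dictionary size, the stopping tolerance, and the number of atoms are all assumptions chosen for illustration, and only the discrete cosine dictionary is covered (no Gabor dictionary, no overlapping frames).

```python
import numpy as np

def dct_dictionary(n, n_dict):
    """Return an n x n_dict matrix of unit-norm cosine atoms (hypothetical helper)."""
    t = np.arange(n)[:, None] + 0.5
    f = np.arange(n_dict)[None, :]
    D = np.cos(np.pi * t * f / n_dict)
    return D / np.linalg.norm(D, axis=0)

def omp(A, y, n_atoms, tol=1e-6):
    """Orthogonal Matching Pursuit on a matrix A with unit-norm columns."""
    residual = y.copy()
    support, coef = [], np.zeros(0)
    for _ in range(n_atoms):
        if np.linalg.norm(residual) <= tol:
            break
        # Select the atom most correlated with the current residual.
        j = int(np.argmax(np.abs(A.T @ residual)))
        if j in support:
            break
        support.append(j)
        # Re-fit all selected atoms jointly (the "orthogonal" step).
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coef
    return support, coef

def inpaint_frame(frame, known, n_atoms=8, n_dict=None):
    """Fill the samples where known is False, using only the known samples."""
    n = frame.size
    D = dct_dictionary(n, n_dict or 2 * n)
    A = D[known]                                  # atoms restricted to known samples
    norms = np.maximum(np.linalg.norm(A, axis=0), 1e-12)
    support, coef = omp(A / norms, frame[known], n_atoms)
    # Synthesize the full frame from the full-length atoms.
    estimate = D[:, support] @ (coef / norms[support])
    restored = frame.copy()
    restored[~known] = estimate[~known]           # known samples are left untouched
    return restored
```

The key design point in this sketch is that the sparse coefficients are estimated from the known samples only, then used with the full-length atoms to synthesize the missing block.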
ABSTRACT: Underwater acoustic imaging is traditionally performed with beamforming: beams are formed at emission to insonify limited angular regions; beams are (synthetically) formed at reception to form the image. We propose to exploit a natural sparsity prior to perform 3D underwater imaging using a newly built flexible-configuration sonar device. The computational challenges raised by the high-dimensionality of the problem are highlighted, and we describe a strategy to overcome them. As a proof of concept, the proposed approach is used on real data acquired with the new sonar to obtain an image of an underwater target. We discuss the merits of the obtained image in comparison with standard beamforming, as well as the main challenges lying ahead, and the bottlenecks that will need to be solved before sparse methods can be fully exploited in the context of underwater compressed 3D sonar imaging.
IEEE Int. Conf. on Acoustics, Speech and Signal Processing (ICASSP 2012); 03/2012
ABSTRACT: We aim to assess the perceived quality of estimated source signals in the context of audio source separation. These signals may involve one or more kinds of distortions, including distortion of the target source, interference from the other sources or musical noise artifacts. We propose a subjective test protocol to assess the perceived quality with respect to each kind of distortion and collect the scores of 20 subjects over 80 sounds. We then propose a family of objective measures aiming to predict these subjective scores based on the decomposition of the estimation error into several distortion components and on the use of the PEMO-Q perceptual salience measure to provide multiple features that are then combined. These measures increase correlation with subjective scores up to 0.5 compared to nonlinear mapping of individual state-of-the-art source separation measures. Finally, we release the data and code presented in this paper in a freely available toolkit called PEASS.
IEEE Transactions on Audio Speech and Language Processing 10/2011; · 1.68 Impact Factor
ABSTRACT: We present a novel sparse-representation-based approach for the restoration of clipped audio signals. In the proposed approach, the clipped signal is decomposed into overlapping frames and the declipping problem is formulated as an inverse problem per audio frame. This problem is then solved by a constrained matching pursuit algorithm that exploits the sign pattern of the clipped samples and their maximal absolute value. Performance evaluation with a collection of music and speech signals demonstrates superior results compared to existing algorithms, over a wide range of clipping levels.
Acoustics, Speech and Signal Processing, IEEE International Conference on (ICASSP 2011). 01/2011;
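The constraint exploited in the declipping abstract above can be illustrated with a small consistency check. This is only a sketch under one reading of the abstract, namely that a restored sample at a clipped position should keep the sign of the clipped sample and reach at least the clipping level in magnitude. It is not the constrained matching pursuit itself, and the names `clipping_masks`, `is_consistent`, and the tolerance are hypothetical.

```python
import numpy as np

def clipping_masks(clipped, level):
    """Locate positively and negatively clipped samples (assumes a known level)."""
    pos = clipped >= level
    neg = clipped <= -level
    return pos, neg

def is_consistent(estimate, clipped, level, tol=1e-9):
    """Check that an estimate agrees with the clipped observation.

    Unclipped samples must be reproduced; clipped samples must keep their
    sign and have magnitude at least equal to the clipping level.
    """
    pos, neg = clipping_masks(clipped, level)
    reliable = ~(pos | neg)
    ok_reliable = np.allclose(estimate[reliable], clipped[reliable], atol=tol)
    ok_pos = np.all(estimate[pos] >= level - tol)
    ok_neg = np.all(estimate[neg] <= -level + tol)
    return bool(ok_reliable and ok_pos and ok_neg)
```

A check of this kind could be used to reject candidate reconstructions that contradict the clipped observation, for example an estimate that falls below the clipping level where the recording was saturated.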
ABSTRACT: This paper investigates a new scheme for single-sensor audio source separation. The framework is introduced in comparison with the existing Gaussian mixture model generative approach and focuses on the mixture states rather than on the source states, resulting in a discrete, joint-state discriminant approach. The study establishes the theoretical performance bounds of the proposed scheme, and an actual source separation system is designed. The performance is computed on a set of musical recordings and the results are discussed, including the question of source correlation and the possible drawbacks of the method.
ABSTRACT: In this paper, oracle performance bounds are derived for single-sensor source separation under the constraint of a finite number of discrete states. Because the imposed constraints are those underlying existing systems, the resulting performance bounds are more realistic than those obtained with a time-frequency masking constraint alone. In this context, the theoretical efficiency of Gaussian mixture approaches is quantified and compared with results from a state-of-the-art system. Future approaches are considered in which these models evolve toward joint-state discriminative methods.
ABSTRACT: In this paper, we address the problem of assessing the perceived quality of estimated source signals in the context of audio source separation. These signals involve different kinds of distortions depending on the considered separation algorithm, including distortion of the target source, interference from other sources or musical noise artifacts. A new MUSHRA-based subjective test protocol is proposed to assess the perceived quality with respect to each kind of distortion and collect the scores of 20 subjects over 80 sounds. Subsequently, the contribution of each type of distortion to the overall quality is analyzed. We propose a family of objective measures aiming to predict the subjective scores based on a decomposition of the estimation error into several distortion components. We conclude by discussing possible implications of this work in the field of 3D audio quality assessment.
AES 38th International Conference on Sound Quality Evaluation.