Nicki Holighaus

  • PhD
  • Austrian Academy of Sciences (OeAW)

About

69
Publications
24,285
Reads
1,628
Citations
Current institution
Austrian Academy of Sciences (OeAW)
Additional affiliations
August 2018 - present
Austrian Academy of Sciences (OeAW)
Position
  • Research Associate
October 2013 - July 2018
Austrian Academy of Sciences (OeAW)
Position
  • Research Associate
August 2012 - October 2013
Austrian Academy of Sciences (OeAW)
Position
  • Researcher
Description
  • PhD student
Education
October 2010 - October 2013
University of Vienna
Field of study
  • Mathematics

Publications (69)
Conference Paper
Full-text available
Ultrasonic vocalizations (USV) convey information about individual identity and arousal status in mice. We propose to track USV as ridges in the time–frequency domain via a variant of time–frequency reassignment (TFR). The key idea is to perform TFR with empirical Wiener shrinkage and multitapering to improve robustness to noise. Furthermore, we p...
Article
Full-text available
Warped time-frequency systems have recently been introduced as a class of structured continuous frames for functions on the real line. Herein, we generalize this framework to the setting of functions of arbitrary dimensionality. After showing that the basic properties of warped time-frequency representations carry over to higher dimensions, we dete...
Preprint
Full-text available
In a recent paper, we have shown that warped time-frequency representations provide a rich framework for the construction and study of smoothness spaces matched to very general phase space geometries obtained by diffeomorphic deformations of $\mathbb{R}^d$. Here, we study these spaces, obtained through the application of general coorbit theory, usi...
Article
Full-text available
We provide the foundations of a Hilbert space theory for the short-time Fourier transform (STFT) where the flat tori $\mathbb{T}^2_N = \mathbb{R}^2/(\mathbb{Z}\times N\mathbb{Z}) = [0,1]\times[0,N]$ act as phase spaces. We work on an $N$-dimensional subspace $S_N$ of distributions periodic in time and frequency in the dual $S_0'(\mathbb{R})$ of the Feichtinger algebra $S_0(\mathbb{R})$ and equip it with an inner...
Preprint
Full-text available
We provide the foundations of a Hilbert space theory for the short-time Fourier transform (STFT) where the flat tori $\mathbb{T}^2_N = \mathbb{R}^2/(\mathbb{Z}\times N\mathbb{Z}) = [0,1]\times[0,N]$ act as phase spaces. We work on an $N$-dimensional subspace $S_N$ of distributions periodic in time and frequency in the dual $S_0'(\mathbb{R})$ of the Feichtinger algebra $S_0(\mathbb{R})$ and equip it with an inner...
Conference Paper
Full-text available
Personalised head-related transfer functions (HRTFs) represent a key component for applications in virtual or augmented reality with high demands in perceptual audio. Numerical methods enable the calculation of personalised HRTFs based on the individual geometry of a listener. Such a geometry can be generated by exploiting the information from mul...
Preprint
Full-text available
We provide an example for the generating matrix $A$ of a two-dimensional lattice $\Gamma = A\mathbb{Z}^2$, such that the following holds: For any sufficiently smooth and localized mother wavelet $\psi$, there is a constant $\beta(A,\psi)>0$, such that $\beta\Gamma\cap (\mathbb{R}\times\mathbb{R}^+)$ is a set of stable sampling for the wavelet syste...
Preprint
The constant center frequency to bandwidth ratio (Q-factor) of wavelet transforms provides a very natural representation for audio data. However, invertible wavelet transforms have either required non-uniform decimation -- leading to irregular data structures that are cumbersome to work with -- or require excessively high oversampling with unaccept...
Article
The constant center frequency to bandwidth ratio (Q-factor) of wavelet transforms provides a very natural representation for audio data. However, invertible wavelet transforms have either required non-uniform decimation—leading to irregular data structures that are cumbersome to work with—or require excessively high oversampling with unacceptable c...
Preprint
Full-text available
Warped time-frequency systems have recently been introduced as a class of structured continuous frames for functions on the real line. Herein, we generalize this framework to the setting of functions of arbitrary dimensionality. After showing that the basic properties of warped time-frequency representations carry over to higher dimensions, we dete...
Preprint
Full-text available
Finding the best K-sparse approximation of a signal in a redundant dictionary is an NP-hard problem. Suboptimal greedy matching pursuit (MP) algorithms are generally used for this task. In this work, we present an acceleration technique and an implementation of the matching pursuit algorithm acting on a multi-Gabor dictionary, i.e., a concatenation...
Preprint
Full-text available
Signal reconstruction from magnitude-only measurements presents a long-standing problem in signal processing. In this contribution, we propose a phase (re)construction method for filter banks with uniform decimation and controlled frequency variation. The suggested procedure extends the recently introduced phase-gradient heap integration and relies...
Preprint
Full-text available
The phase vocoder (PV) is a widely spread technique for processing audio signals. It employs a short-time Fourier transform (STFT) analysis-modify-synthesis loop and is typically used for time-scaling of signals by means of using different time steps for STFT analysis and synthesis. The main challenge of PV used for that purpose is the correction o...
Preprint
Full-text available
The scattering transform is a non-linear signal representation method based on cascaded wavelet transform magnitudes. In this paper we introduce phase scattering, a novel approach where we use phase derivatives in a scattering procedure. We first revisit phase-related concepts for representing time-frequency information of audio signals, in particu...
Preprint
Full-text available
Audio inpainting refers to signal processing techniques that aim at restoring missing or corrupted consecutive samples in audio signals. Prior works have shown that $\ell_1$-minimization with appropriate weighting is capable of solving audio inpainting problems, both for the analysis and the synthesis models. These models assume that audio signals...
Article
Schur's test for integral operators states that if a kernel $K:X\times Y\to\mathbb{C}$ satisfies $\int_Y |K(x,y)|\,d\nu(y)\leq C$ and $\int_X |K(x,y)|\,d\mu(x)\leq C$, then the associated integral operator is bounded from $L^p(\nu)$ into $L^p(\mu)$, simultaneously for all $p\in[1,\infty]$. We derive a variant of this result which ensures that the integral operator acts boundedly on the (weighted) mixed-norm Lebesgu...
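As an illustration of the classical statement quoted in this abstract (not of the paper's mixed-norm variant), the following minimal sketch checks Schur's test numerically in the discrete case, assuming X and Y are finite index sets with counting measure, so that the kernel is a matrix and the two integral conditions become bounds on absolute row and column sums. The function names are hypothetical and chosen only for this example.

```python
import random


def p_norm(v, p):
    """l^p norm of a finite sequence; p may be float('inf')."""
    if p == float("inf"):
        return max(abs(x) for x in v)
    return sum(abs(x) ** p for x in v) ** (1.0 / p)


def schur_constant(K):
    """Smallest C bounding all absolute row sums and column sums of K."""
    row_bound = max(sum(abs(k) for k in row) for row in K)
    col_bound = max(
        sum(abs(K[i][j]) for i in range(len(K))) for j in range(len(K[0]))
    )
    return max(row_bound, col_bound)


def apply_kernel(K, f):
    """Discrete integral operator: (Tf)(x) = sum_y K(x, y) f(y)."""
    return [sum(k * x for k, x in zip(row, f)) for row in K]


def check_schur(K, f, exponents=(1, 1.5, 2, 4, float("inf"))):
    """Check ||Tf||_p <= C ||f||_p for each exponent p (small float tolerance)."""
    C = schur_constant(K)
    Tf = apply_kernel(K, f)
    return all(
        p_norm(Tf, p) <= C * p_norm(f, p) * (1 + 1e-9) for p in exponents
    )
```

For example, drawing a random 5 × 7 matrix `K` and a random vector `f` of length 7 and calling `check_schur(K, f)` should return `True` for every draw, since the bound holds for all inputs.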
Article
Finding the best K-sparse approximation of a signal in a redundant dictionary is an NP-hard problem. Suboptimal greedy matching pursuit algorithms are generally used for this task. In this work, we present an acceleration technique and an implementation of the matching pursuit algorithm acting on a multi-Gabor dictionary, i.e., a concatenation of...
Article
In audio processing applications, phase retrieval (PR) is often performed from the magnitude of short-time Fourier transform (STFT) coefficients. Although PR performance has been observed to depend on the considered STFT parameters and audio data, the extent of this dependence has not been systematically evaluated yet. To address this, we studied t...
Preprint
Full-text available
In audio processing applications, phase retrieval (PR) is often performed from the magnitude of short-time Fourier transform (STFT) coefficients. Although PR performance has been observed to depend on the considered STFT parameters and audio data, the extent of this dependence has not been systematically evaluated yet. To address this, we studied t...
Article
The papers from this special section focus on the restoration of audio content, in particular speech and music, from degraded observations. This is a challenging and long-standing problem in audio processing. In particular, this holds for severe degradations and incomplete observations, which are regularly encountered in practice. The papers in this s...
Article
We introduce GACELA, a conditional generative adversarial network (cGAN) designed to restore missing audio data with durations ranging between hundreds of milliseconds and a few seconds, i.e., to perform long-gap audio inpainting. While previous work either addressed shorter gaps or relied on exemplars by copying available information from other si...
Conference Paper
Full-text available
Matching pursuit (MP) algorithms are widely used greedy methods to find K-sparse signal approximations in redundant dictionaries. We present an acceleration technique and an implementation of the matching pursuit algorithm acting on a multi-Gabor dictionary, i.e., a concatenation of several Gabor-type time-frequency dictionaries, consisting of tra...
Preprint
Full-text available
Schur's test states that if $K:X\times Y\to\mathbb{C}$ satisfies $\int_Y |K(x,y)|d\nu(y)\leq C$ and $\int_X |K(x,y)|d\mu(x)\leq C$, then the associated integral operator acts boundedly on $L^p$ for all $p\in [1,\infty]$. We derive a variant of this result ensuring boundedness on the (weighted) mixed-norm Lebesgue spaces $L_w^{p,q}$ for all $p,q\in...
Preprint
Full-text available
We introduce GACELA, a generative adversarial network (GAN) designed to restore missing musical audio data with a duration ranging from hundreds of milliseconds to a few seconds, i.e., to perform long-gap audio inpainting. While previous work either addressed shorter gaps or relied on exemplars by copying available information from other signal...
Article
Full-text available
A method for constructing non-uniform filter banks is presented. Starting from a uniform system of translates, generated by a prototype filter, a non-uniform covering of the frequency axis is obtained by composition with a warping function. The warping function is a $\mathcal{C}^1$-diffeomorphism that determines the frequency progression and c...
Article
Full-text available
We study the ability of deep neural networks (DNNs) to restore missing audio content based on its context, i.e., in-paint audio gaps. We focus on a condition which has not received much attention yet: gaps in the range of tens of milliseconds. We propose a DNN structure that is provided with the signal surrounding the gap in the form of time-freque...
Conference Paper
Full-text available
In this work, we present an algorithm for phaseless reconstruction from magnitude-only wavelet coefficients. The method relies on an explicit relation between the log-magnitude and phase gradients of analytic wavelet transforms and an extension of the Phase-Gradient Heap Integration (PGHI) algorithm recently introduced for Gabor phaseless reconstru...
Conference Paper
Full-text available
In this contribution, we present a heuristic method to invert the reassigned short time Fourier transform magnitude spectrum to allow the reconstruction of the original time domain signal. This is a simple method just involving an additional smearing step before phase retrieval. Finally, we provide some numerical evidence that our method combined w...
Poster
Full-text available
Time-frequency (TF) representations provide powerful and intuitive features for the analysis of time series such as audio. But still, generative modeling of audio in the TF domain is a subtle matter. Consequently, neural audio synthesis widely relies on directly modeling the waveform and previous attempts at unconditionally synthesizing audio from...
Article
Full-text available
We obtain a characterization of all wavelets leading to analytic wavelet transforms (WT). The characterization is obtained as a by-product of the theoretical foundations of a new method for wavelet phase reconstruction from magnitude-only coefficients. The cornerstone of our analysis is an expression of the partial derivatives of the continuous WT,...
Conference Paper
Full-text available
Time-frequency (TF) representations provide powerful and intuitive features for the analysis of time series such as audio. But still, generative modeling of audio in the TF domain is a subtle matter. Consequently, neural audio synthesis widely relies on directly modeling the waveform and previous attempts at unconditionally synthesizing audio from...
Conference Paper
We studied the ability of deep neural networks (DNNs) to restore missing audio content based on its context, a process usually referred to as audio inpainting. We focused on gaps in the range of tens of milliseconds. The proposed DNN structure was trained on audio signals containing music and musical instruments, separately, with 64-ms long gaps. T...
Preprint
Full-text available
Time-frequency (TF) representations provide powerful and intuitive features for the analysis of time series such as audio. But still, generative modeling of audio in the TF domain is a subtle matter. Consequently, neural audio synthesis widely relies on directly modeling the waveform and previous attempts at unconditionally synthesizing audio from...
Preprint
Full-text available
We obtain a characterization of all wavelets leading to analytic wavelet transforms (WT), resulting in a new generalization of the Cauchy wavelets. The characterization is obtained as a by-product of the theoretical foundations of a new method for wavelet phase reconstruction from magnitude-only coefficients. The cornerstone of our analysis is an e...
Preprint
Full-text available
We studied the ability of deep neural networks (DNNs) to restore missing audio content based on its context, a process usually referred to as audio inpainting. We focused on gaps in the range of tens of milliseconds, a condition which has not received much attention yet. The proposed DNN structure was trained on audio signals containing music and m...
Article
Full-text available
We present a novel family of continuous, linear time-frequency transforms adaptable to a multitude of (nonlinear) frequency scales. Similar to classical time-frequency or time-scale representations, the representation coefficients are obtained as inner products with the elements of a continuously indexed family of time-frequency atoms. These atoms...
Article
We present a novel method for the compensation of long duration data loss in audio signals, in particular music. The concealment of such signal defects is based on a graph that encodes signal structure in terms of time-persistent spectral similarity. A suitable candidate segment for the substitution of the lost content is proposed by an intuitive o...
Article
Full-text available
Featured Application: The proposed framework is highly suitable for audio applications that require analysis–synthesis systems with the following properties: stability, perfect reconstruction, and a flexible choice of redundancy. Abstract: Many audio applications rely on filter banks (FBs) to analyze, process, and re-synthesize sounds. For these app...
Conference Paper
Full-text available
The phase vocoder (PV) is a widely spread technique for processing audio signals. It employs a short-time Fourier transform (STFT) analysis-modify-synthesis loop and is typically used for time-scaling of signals by means of using different time steps for STFT analysis and synthesis. The main challenge of PV used for that purpose is the correction o...
Conference Paper
Full-text available
Signal reconstruction from magnitude-only measurements presents a long-standing problem in signal processing. In this contribution we propose a phase (re)construction method for filter banks with uniform decimation and controlled frequency variation. The suggested procedure extends the recently introduced phase-gradient heap integration and relies...
Chapter
This review chapter aims to strengthen the link between frame theory and signal processing tasks in psychoacoustics. On the one side, the basic concepts of frame theory are presented and some proofs are provided to explain those concepts in some detail. The goal is to reveal to hearing scientists how this mathematical theory could be relevant for t...
Article
During the process of writing the manuscript ["Continuous warped time-frequency representations - Coorbit spaces and discretization", N. Holighaus, C. Wiesmeyr and P. Balazs], the work ["Continuous Frames, Function Spaces and the Discretization Problem" by M. Fornasier and H. Rauhut - (1)] was one of the major foundations of our results and, natura...
Article
Full-text available
This review chapter aims to strengthen the link between frame theory and signal processing tasks in psychoacoustics. On the one side, the basic concepts of frame theory are presented and some proofs are provided to explain those concepts in some detail. The goal is to reveal to hearing scientists how this mathematical theory could be relevant for t...
Article
Full-text available
We present a novel method for the compensation of long duration data loss in audio signals, in particular music. The concealment of such signal defects is based on a graph that encodes signal structure in terms of time-persistent spectral similarity. A suitable candidate segment for the substitution of the lost content is proposed by an intuitive o...
Article
Full-text available
Many audio applications rely on filter banks (FBs) to analyze, process, and re-synthesize sounds. To approximate the auditory frequency resolution in the signal chain, some applications rely on perceptually motivated FBs, the gammatone FB being a popular example. However, most perceptually motivated FBs only allow partial signal reconstruction at h...
Article
In this contribution, we extend the reassignment method (RM) and synchrosqueezing transform (SST) to arbitrary time-frequency localized filters and, in the first case, arbitrary decimation factors. A sufficient condition for the invertibility of the SST is provided. RM and SST are techniques for deconvolution of short-time Fourier and wavelet repre...
Article
The fixed time-frequency resolution of the short-time Fourier transform has often been considered a major drawback. In this contribution we review recent results on a class of time-frequency transforms that adapt to a large class of frequency scales in the same sense that wavelet transforms are adapted to a logarithmic scale. In particular, we show...
Article
Full-text available
We present a novel family of continuous linear time-frequency transforms adapted to a multitude of (nonlinear) frequency scales. Similar to classical time-frequency or time-scale representations, the representation coefficients are obtained as inner products with the elements of a continuously indexed family of time-frequency atoms. These atoms are...
Conference Paper
Full-text available
The Large Time Frequency Analysis Toolbox (LTFAT) is a modern Octave/Matlab toolbox for time-frequency analysis, synthesis, coefficient manipulation and visualization. Its purpose is to serve as a tool for achieving new scientific developments as well as an educational tool. The present paper introduces the main features of the second major release of...
Article
Full-text available
We deduce a simple representation and the invariant factor decompositions of the subgroups of the group Z_m × Z_n, where m and n are arbitrary positive integers. We obtain formulas for the total number of subgroups and the number of subgroups of a given order.
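The abstract above does not state the counting formulas themselves. As a hedged illustration, the sketch below assumes the closed form s(m, n) = Σ gcd(a, b), summed over all divisors a of m and b of n, for the total number of subgroups of Z_m × Z_n, and compares it with a brute-force enumeration. The brute force relies on the fact that every subgroup of Z_m × Z_n can be generated by at most two elements. All function names are hypothetical.

```python
from itertools import product
from math import gcd


def divisors(k):
    """All positive divisors of k."""
    return [d for d in range(1, k + 1) if k % d == 0]


def count_subgroups_formula(m, n):
    """Assumed closed form: sum of gcd(a, b) over divisors a | m and b | n."""
    return sum(gcd(a, b) for a in divisors(m) for b in divisors(n))


def count_subgroups_brute_force(m, n):
    """Count distinct subgroups generated by pairs of elements of Z_m x Z_n."""
    exponent = m * n // gcd(m, n)  # lcm(m, n): every element has order dividing it
    elements = list(product(range(m), range(n)))
    subgroups = set()
    for g, h in product(elements, repeat=2):
        # All integer combinations i*g + j*h, reduced componentwise.
        span = frozenset(
            ((i * g[0] + j * h[0]) % m, (i * g[1] + j * h[1]) % n)
            for i in range(exponent)
            for j in range(exponent)
        )
        subgroups.add(span)
    return len(subgroups)
```

The brute-force count is only practical for small m and n, but it gives an independent check of the assumed formula, e.g. by comparing `count_subgroups_formula(4, 6)` with `count_subgroups_brute_force(4, 6)`.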
Article
A flexible method for constructing nonuniform filter banks is presented. Starting from a uniform system of translates, a nonuniform covering of the frequency axis is obtained by nonlinear evaluation. The frequency progression is determined by a warping function that can be chosen freely apart from some minor restrictions. The resulting functions ar...
Article
Full-text available
Redundant Gabor frames admit an infinite number of dual frames, yet only the canonical dual Gabor system, constructed from the minimal $\ell^2$-norm dual window, is widely used. This window function, however, might lack desirable properties, such as good time-frequency concentration, small support or smoothness. We employ convex optimization methods to...
Conference Paper
Full-text available
In this paper, we propose a time-frequency representation where the frequency bins are distributed uniformly in log-frequency and their Q-factors obey a linear function of the bin center frequencies. The latter allows for time-frequency representations where the bandwidths can be e.g. constant on the log-frequency scale (constant Q) or constant on...
Article
Full-text available
We consider the problem of designing spectral graph filters for the construction of dictionaries of atoms that can be used to efficiently represent signals residing on weighted graphs. While the filters used in previous spectral graph wavelet constructions are only adapted to the length of the spectrum, the filters proposed in this paper are adapte...
Conference Paper
Full-text available
This paper describes a method for obtaining a perceptually motivated and perfectly invertible time-frequency representation of a sound signal. Based on frame theory and the recent non-stationary Gabor transform, a linear representation with resolution evolving across frequency is formulated and implemented as a non-uniform filterbank. To match the...
Article
Full-text available
The Discrete Gabor Transform (DGT) is the most commonly used transform for signal analysis and synthesis using a linear frequency scale. It turns out that the involved operators are rich in structure if one samples the discrete phase space on a subgroup. Most of the literature focuses on separable subgroups; in this paper we will survey existing me...
Conference Paper
Full-text available
Redundant Gabor frames admit an infinite number of dual frames, yet only the canonical dual Gabor system, constructed from the minimal $\ell^2$-norm dual window, is widely used. This window function, however, might lack desirable properties, such as good time-frequency concentration, small support or smoothness. We employ convex optimization methods to des...
Article
Full-text available
We investigate the structural properties of dual systems for nonstationary Gabor frames. In particular, we prove that some inverse nonstationary Gabor frame operators admit a Walnut-like representation, i.e. the operator acting on a function can be described by weighted translates of that function, even when the original frame operator is not diago...
Article
Full-text available
We discuss properties of the subgroups of the group Z_m × Z_n, where m and n are arbitrary positive integers. Simple formulae for the total number of subgroups and the number of subgroups of a given order are deduced. The cyclic subgroups and subgroups of a given exponent are also considered.
Article
Full-text available
Audio signal processing frequently requires time-frequency representations and in many applications, a non-linear spacing of frequency-bands is preferable. This paper introduces a framework for efficient implementation of invertible signal transforms allowing for non-uniform and in particular non-linear frequency resolution. Non-uniformity in frequ...
Conference Paper
Full-text available
In this paper the potential of using nonstationary Gabor transform for beat tracking in music is examined. Nonstationary Gabor transforms are a generalization of the short-time Fourier transform, which allow flexibility in choosing the number of bins per octave, while retaining a perfect inverse transform. In this paper, it is evaluated if these pr...
Article
Full-text available
Signal analysis with classical Gabor frames leads to a fixed time-frequency resolution over the whole time-frequency plane. To overcome the limitations imposed by this rigidity, we propose an extension of Gabor theory that leads to the construction of frames with time-frequency resolution changing over time or frequency. We describe the constructio...
Conference Paper
Full-text available
An efficient and perfectly invertible signal transform featuring a constant-Q frequency resolution is presented. The proposed approach is based on the idea of the recently introduced nonstationary Gabor frames. Exploiting the properties of the operator corresponding to a family of analysis atoms, this approach overcomes the problems of the cl...
