About
274 Publications · 109,289 Reads
74,118 Citations
Additional affiliations
January 1990 - present
Publications (274)
Single-cell data integration can provide a comprehensive molecular view of cells, and many algorithms have been developed to remove unwanted technical or biological variations and integrate heterogeneous single-cell datasets. Despite their wide usage, existing methods suffer from several fundamental limitations. In particular, we lack a rigorous st...
Although the concepts of non-uniform sampling (NUS) and non-Fourier spectral reconstruction in multidimensional NMR began to emerge four decades ago (Bodenhausen and Ernst, 1981; Barna and Laue, 1987), it is only relatively recently that NUS has become more commonplace. Advantages of NUS include the ability to tailor experiments to reduce data coll...
Consider a multiple hypothesis testing setting involving rare/weak features: only few features, out of possibly many, deviate from their null hypothesis behavior. Summarizing the significance of each feature by a P-value, we construct a global test against the null using the Higher Criticism (HC) statistics of these P-values. We calibrate the rare/...
We derive a formula for optimal hard thresholding of the singular value decomposition in the presence of correlated additive noise; although it nominally involves unobservables, we show how to apply it even where the noise covariance structure is not a priori known or is not independently estimable. The proposed method, which we call ScreeNOT, is a...
Given two samples from possibly different discrete distributions over a common set of size $N$, consider the problem of testing whether these distributions are identical, vs. the following rare/weak perturbation alternative: the frequencies of $N^{1-\beta}$ elements are perturbed by $r(\log N)/2n$ in the Hellinger distance, where $n$ is the size of...
Modern data science research can involve massive computational experimentation; an ambitious PhD in computational fields may do experiments consuming several million CPU hours. Traditional computing practices, in which researchers use laptops or shared campus-resident resources, are inadequate for experiments at the massive scale and varied scope t...
We study estimation of the covariance matrix under relative condition number loss $\kappa(\Sigma^{-1/2} \hat{\Sigma} \Sigma^{-1/2})$, where $\kappa(\Delta)$ is the condition number of matrix $\Delta$, and $\hat{\Sigma}$ and $\Sigma$ are the estimated and theoretical covariance matrices. Optimality in $\kappa$-loss provides optimal guarantees in two...
Recovering images from undersampled linear measurements typically leads to an ill-posed linear inverse problem, which calls for proper statistical priors. Building effective priors is however challenged by the low training and testing overhead dictated by real-time tasks, and by the need for retrieving visually "plausible" and physically "feasible" images wit...
More than 50 years ago, John Tukey called for a reformation of academic statistics. In “The Future of Data Analysis,” he pointed to the existence of an as-yet unrecognized science, whose subject of interest was learning from data, or “data analysis.” Ten to 20 years ago, John Chambers, Jeff Wu, Bill Cleveland, and Leo Breiman independently once aga...
We study anisotropic undersampling schemes like those used in multi-dimensional NMR spectroscopy and MR imaging, which sample exhaustively in certain time dimensions and randomly in others. Our analysis shows that anisotropic undersampling schemes are equivalent to certain block-diagonal measurement systems. We develop novel exact formulas for the...
In NMR spectroscopy, undersampling in the indirect dimensions causes reconstruction artifacts whose size can be bounded using the so-called coherence. In experiments with multiple indirect dimensions, new undersampling approaches were recently proposed: random phase detection (RPD) (Maciejewski et al., 2011) and its generalization, partial componen...
The increasing availability of access to large-scale computing clusters (for example, via the cloud) is changing the way scientific research can be conducted, enabling experiments of a scale and scope that would have been inconceivable several years ago. An ambitious data scientist today can carry out projects involving several million CPU hours. In t...
In a recent article (Proc. Natl. Acad. Sci., 110(36), 14557-14562), El Karoui
et al. study the distribution of robust regression estimators in the regime in
which the number of parameters $p$ is of the same order as the number of
samples $n$. Using numerical simulations and 'highly plausible' heuristic
arguments, they unveil a striking new phenomen...
A half century ago, Huber evaluated the minimax asymptotic variance in scalar
location estimation, $ \min_\psi \max_{F \in {\cal F}_\epsilon} V(\psi, F) =
\frac{1}{I(F_\epsilon^*)} $, where $V(\psi,F)$ denotes the asymptotic variance
of the $(M)$-estimator for location with score function $\psi$, and
$I(F_\epsilon^*)$ is the minimal Fisher informat...
In modern high-throughput data analysis, researchers perform a large number
of statistical tests, expecting to find perhaps a small fraction of significant
effects against a predominantly null background. Higher Criticism (HC) was
introduced to determine whether there are any non-zero effects; more recently,
it was applied to feature selection, whe...
We consider recovery of low-rank matrices from noisy data by shrinkage of
singular values, in which a single, univariate nonlinearity is applied to each
of the empirical singular values. We adopt an asymptotic framework, in which
the matrix size is much larger than the rank of the signal matrix to be
recovered, and the signal-to-noise ratio of the...
Since the seminal work of Stein (1956) it has been understood that the
empirical covariance matrix can be improved by shrinkage of the empirical
eigenvalues. In this paper, we consider a proportional-growth asymptotic
framework with $n$ observations and $p_n$ variables having limit $p_n/n \to
\gamma \in (0,1]$. We assume the population covariance m...
Consider the compressed sensing problem of estimating an unknown k-sparse n-vector from a set of m noisy linear equations. Recent work focused on the noise sensitivity of particular algorithms, i.e., the scaling of the reconstruction error with added noise. In this paper, we study the minimax noise sensitivity; the minimum is over all possible recovery...
Recent work on Approximate Message Passing algorithms in compressed sensing focuses on 'ideal' algorithms which at each iteration face a subproblem of recovering an unknown sparse signal in Gaussian white noise. The noise level in each subproblem changes from iteration to iteration in a way that depends on the underlying signal (which we don't know...
We consider recovery of low-rank matrices from noisy data by hard
thresholding of singular values, where singular values below a prescribed
threshold \lambda are set to 0. We study the asymptotic MSE in a framework
where the matrix size is large compared to the rank of the matrix to be
recovered, and the signal-to-noise ratio of the low-rank piece...
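The hard-thresholding scheme described above can be sketched in a few lines of NumPy. This is a minimal illustration: the threshold value is chosen by hand here, not by the paper's asymptotic analysis.

```python
import numpy as np

def svd_hard_threshold(Y, lam):
    """Denoise Y by zeroing every singular value below the threshold lam."""
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    s[s < lam] = 0.0                      # hard thresholding of singular values
    return U @ np.diag(s) @ Vt

# Toy check: a diagonal matrix with singular values 3 and 1; thresholding
# at lam = 2 keeps only the leading singular value.
Xhat = svd_hard_threshold(np.diag([3.0, 1.0]), lam=2.0)
```

In a noisy low-rank recovery setting one would place the threshold just above the bulk edge of the noise singular values, so that only signal components survive.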
Let $X_0$ be an unknown $M$ by $N$ matrix. In matrix recovery, one takes $n <
MN$ linear measurements $y_1,..., y_n$ of $X_0$, where $y_i = \Tr(a_i^T X_0)$
and each $a_i$ is a $M$ by $N$ matrix. For measurement matrices with Gaussian
i.i.d. entries, it is known that if $X_0$ is of low rank, it is recoverable from
just a few measurements. A popular appr...
An unknown $m$ by $n$ matrix $X_0$ is to be estimated from noisy measurements
$Y = X_0 + Z$, where the noise matrix $Z$ has i.i.d Gaussian entries. A popular
matrix denoising scheme solves the nuclear norm penalization problem $\min_X ||
Y - X ||_F^2/2 + \lambda ||X||_* $, where $ ||X||_*$ denotes the nuclear norm
(sum of singular values). This is...
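The nuclear-norm penalization problem above admits a well-known closed-form solution: soft-threshold the singular values of $Y$ at level $\lambda$. A minimal NumPy sketch (variable names are mine, not the paper's):

```python
import numpy as np

def nuclear_norm_denoise(Y, lam):
    """Closed-form minimizer of ||Y - X||_F^2 / 2 + lam * ||X||_*:
    soft-threshold the singular values of Y at level lam."""
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    return U @ np.diag(np.maximum(s - lam, 0.0)) @ Vt

# Toy check: singular values 3 and 1 shrink to 1.5 and 0 at lam = 1.5.
Xhat = nuclear_norm_denoise(np.diag([3.0, 1.0]), lam=1.5)
```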
A new system allows researchers to discover, reuse, cite, and experiment upon any computational result that is published with a Verifiable Result Identifier.
In compressed sensing, one takes $n$ samples of an $N$-dimensional vector $x_0$ using an $n \times N$ matrix $A$, obtaining undersampled measurements $y = Ax_0$. For random matrices with independent standard Gaussian entries, it is known that, when ...
Verifiable Computational Results (VCR) is a disciplined approach to computer-based research that requires subtle adjustment to the work habits of scientists, and in return automatically converts their results into permanent Web services. This article describes how VCR makes computational results accessible to three specific Dream Applications, whic...
Many papers studying compressed sensing consider the noisy underdetermined system of linear equations $y = Ax_0 + z$, with $n \times N$ measurement matrix $A$, $n < N$. [...] $\delta > \delta^*(\varepsilon)$, can we have exact recovery even in the noiseless-data strict-sparsity setting. It turns out that the minimax AMSE can be characterized succinctly by a coefficient $\mathrm{sens}^*(\varepsilon, \delta)$ which we refer...
Consider a $d \times n$ matrix $A$, with $d < n$. The problem of solving for $x$ in $y = Ax$ is underdetermined, and has infinitely many solutions (if there are any). In several applications it is of interest to find the sparsest solution, the one with fewest nonzeros. Donoho (2004) showed that if $x$ is sparse enough then it is the unique solution to the optimiza...
Finding the sparsest solution to underdetermined systems of linear equations y = φx is NP-hard in general. We show here that for systems with "typical"/"random" φ, a good approximation to the sparsest solution is obtained by applying a fixed number of standard operations from linear algebra. Our proposal, Stagewise Orthogonal Matching Pursuit (StOM...
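A rough sketch of a StOMP-style iteration in NumPy follows. The thresholding constant and stage count are illustrative defaults, not the tuned values from the paper: each stage adds to the active set every coordinate whose residual correlation exceeds a multiple of the formal noise level, then refits by least squares on the active set.

```python
import numpy as np

def stomp(A, y, n_stages=10, t=2.5):
    """Stagewise OMP sketch: threshold residual correlations, grow the
    active set, refit by least squares, repeat for a fixed number of stages."""
    n, N = A.shape
    support = np.zeros(N, dtype=bool)
    x = np.zeros(N)
    r = y.copy()
    for _ in range(n_stages):
        if np.linalg.norm(r) < 1e-10 * np.linalg.norm(y):
            break                                  # residual essentially zero
        c = A.T @ r                                # residual correlations
        sigma = np.linalg.norm(r) / np.sqrt(n)     # formal noise level
        new = np.abs(c) > t * sigma
        if not new.any():
            break
        support |= new
        idx = np.where(support)[0]
        x_s, *_ = np.linalg.lstsq(A[:, idx], y, rcond=None)
        x = np.zeros(N)
        x[idx] = x_s
        r = y - A @ x
    return x

# Usage: noiseless recovery of a 3-sparse vector (illustrative sizes).
rng = np.random.default_rng(1)
n, N = 64, 128
A = rng.standard_normal((n, N))
A /= np.linalg.norm(A, axis=0)          # unit-norm columns
x0 = np.zeros(N)
x0[[5, 40, 99]] = 5.0
xhat = stomp(A, A @ x0)
```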
We study the compressed sensing reconstruction problem for a broad class of
random, band-diagonal sensing matrices. This construction is inspired by the
idea of spatial coupling in coding theory. As demonstrated heuristically and
numerically by Krzakala et al. \cite{KrzakalaEtAl}, message passing algorithms
can effectively solve the reconstruction...
Compressed sensing posits that, within limits, one can undersample a sparse
signal and yet reconstruct it accurately. Knowing the precise limits to such
undersampling is important both for theory and practice. We present a formula
that characterizes the allowed undersampling of generalized sparse objects. The
formula applies to Approximate Message...
Consider the noisy underdetermined system of linear equations: y=Ax0 + z0,
with n x N measurement matrix A, n < N, and Gaussian white noise z0 ~
N(0,\sigma^2 I). Both y and A are known, both x0 and z0 are unknown, and we
seek an approximation to x0. When x0 has few nonzeros, useful approximations
are obtained by l1-penalized l2 minimization, in whi...
We consider the compressed sensing problem where the object $x_0 \in \mathbb{R}^N$ is to be recovered from incomplete measurements $y = Ax_0 + z$. Here the sensing matrix $A$ is an $n \times N$ random matrix with Gaussian entries and $n < N$. A popular method of sparsity-promoting reconstruction is $\ell_1$-penalized least-squares reconstruct...
We present a discipline for verifiable computational scientific research. Our discipline revolves around three simple new concepts: verifiable computational result (VCR), VCR repository, and Verifiable Result Identifier (VRI). These are web- and cloud-computing oriented concepts, which exploit today's web infrastructure to achieve standard, simple a...
Three-dimensional volumetric data are becoming increasingly available in a wide range of scientific and technical disciplines. With the right tools, we can expect such data to yield valuable insights about many important phenomena in our three-dimensional world. In this paper, we develop tools for the analysis of 3-D data which may contain structur...
Undersampling theorems state that we may gather far fewer samples than the usual sampling theorem requires while exactly reconstructing the object of interest, provided the object in question obeys a sparsity condition, the samples measure appropriate linear combinations of signal values, and we reconstruct with a particular nonlinear procedure. While there...
This special issue presents many exciting and surprising developments in signal and image processing owing to sparse representations. While the technology is new, much of the intellectual heritage of these recent developments can be traced back generations. This foreword sketches the "prehistory of sparsity," a fascinating range of early discover...
I am genuinely thrilled to see Biostatistics make a formal venture into computational reproducibility, and I congratulate the editors of Biostatistics on taking this much-needed step. I find the policies being adopted by Biostatistics eminently practical, and I hope that many authors will begin using this option. In my comments, I...
We conducted an extensive computational experiment, lasting multiple CPU-years, to optimally select parameters for two important classes of algorithms for finding sparse solutions of underdetermined systems of linear equations. We make the optimally tuned implementations available at sparselab.stanford.edu; they run "out of the box" with no user...
In "Counting faces of randomly projected polytopes when the projection radically lowers dimension" the authors proved an asymptotic sampling theorem for sparse signals, showing that $n$ random measurements permit reconstruction of an $N$-vector having $k$ nonzeros provided $n > 2k \log(N/n)(1 + o(1))$; reconstruction uses $\ell_1$ minimization. They...
In a recent paper, the authors proposed a new class of low-complexity iterative thresholding algorithms for reconstructing sparse signals from a small set of linear measurements. The new algorithms are broadly referred to as AMP, for approximate message passing. This is the first of two conference papers describing the derivation of these algorithm...
In a recent paper, the authors proposed a new class of low-complexity iterative thresholding algorithms for reconstructing sparse signals from a small set of linear measurements [DMM]. The new algorithms are broadly referred to as AMP, for approximate message passing. This is the second of two conference papers describing the derivation of the...
We review connections between phase transitions in high-dimensional
combinatorial geometry and phase transitions occurring in modern
high-dimensional data analysis and signal processing. In data analysis, such
transitions arise as abrupt breakdown of linear model selection, robust data
fitting or compressed sensing reconstructions, when the complex...
We consider two-class linear classification in a high-dimensional, low-sample
size setting. Only a small fraction of the features are useful, the useful
features are unknown to us, and each useful feature contributes weakly to the
classification decision -- this setting was called the rare/weak model (RW
Model). We select features by thresholding f...
Compressed sensing aims to undersample certain high-dimensional signals yet accurately reconstruct them by exploiting signal characteristics. Accurate reconstruction is possible when the object to be recovered is sufficiently sparse in a known basis. Currently, the best known sparsity-undersampling tradeoff is achieved when reconstructing by convex...
Astronomical images of galaxies can be modeled as a superposition of pointlike and curvelike structures. Astronomers typically face the problem of extracting those components as accurately as possible. Although this problem seems unsolvable – as there are two unknowns for every datum – suggestive empirical results have been achieved by employing a di...
We conducted an extensive computational experiment, lasting multiple CPU-years, to optimally select parameters for important classes of algorithms for finding sparse solutions of underdetermined systems of linear equations. We make the optimally tuned implementations freely available at sparselab.stanford.edu; they can be used 'out of the box' with...
A full-rank matrix $A \in \mathbb{R}^{n \times m}$, with $n < m$, generates an underdetermined system of linear equations $Ax = b$ having infinitely many solutions. Suppose we seek the sparsest solution, i.e., the one with the fewest nonzero entries: can it ever be unique? If so, when? As optimization of sparsity is combinatorial in nature, are there efficient methods...
Scientific computation is emerging as absolutely central to the scientific method. Unfortunately, it's error-prone and currently immature: traditional scientific publication is incapable of finding and rooting out errors in scientific computation, which must be recognized as a crisis. An important recent development and a necessary response to the cr...
The minimum $\ell_1$-norm solution to an underdetermined system of linear equations $y = Ax$ is often, remarkably, also the sparsest solution to that system. This sparsity-seeking property is of interest in signal processing and information transmission. However, general-purpose optimizers are much too slow for $\ell_1$ minimization i...
In important application fields today—genomics and proteomics are examples—selecting a small subset of useful features is crucial for success of Linear Classification Analysis. We study feature selection by thresholding of feature Z-scores and introduce a principle of threshold selection, based on the notion of higher criticism (HC). For i = 1, 2,...
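The HC objective over sorted P-values can be sketched directly. This is the standard higher-criticism form; the `alpha0` cap and the small regularizer in the denominator are my simplifications, not details from the paper.

```python
import numpy as np

def higher_criticism(pvals, alpha0=0.5):
    """Return (HC*, index) where HC* maximizes the HC objective
    sqrt(n) * (i/n - p_(i)) / sqrt(p_(i) * (1 - p_(i)))
    over the smallest alpha0-fraction of the sorted P-values."""
    p = np.sort(pvals)
    n = len(p)
    i = np.arange(1, n + 1)
    hc = np.sqrt(n) * (i / n - p) / np.sqrt(p * (1 - p) + 1e-12)
    kmax = max(1, int(alpha0 * n))
    j = int(np.argmax(hc[:kmax]))
    return hc[j], j + 1

# Usage: uniform-looking P-values vs. the same set with one very small entry.
p_null = (np.arange(1, 11) - 0.5) / 10
p_spiked = p_null.copy()
p_spiked[0] = 1e-6
hc_null, _ = higher_criticism(p_null)
hc_spiked, _ = higher_criticism(p_spiked)
```

For feature selection by HC thresholding, the Z-score magnitude of the feature attaining the maximum serves as the selection threshold.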
Let $A$ be an $n$ by $N$ real valued random matrix, and $\h$ denote the $N$-dimensional hypercube. For numerous random matrix ensembles, the expected number of $k$-dimensional faces of the random $n$-dimensional zonotope $A\h$ obeys the formula $E f_k(A\h) /f_k(\h) = 1-P_{N-n,N-k}$, where $P_{N-n,N-k}$ is a fair-coin-tossing probability. The formul...
BeamLab is a collection of Matlab functions that have been used by the authors and collaborators to implement a variety of computational algorithms related to beamlet, curvelet and ridgelet analysis. The library is available free of charge over the Internet. Versions are provided for Macintosh, UNIX and Windows operating systems. Downloading and...
We used digital image processing and statistical clustering algorithms to segment and classify brush strokes in master paintings based on two-dimensional space and three-dimensional chromaticity coordinates. For works executed in sparse overlapping brush strokes our algorithm identifies candidate clusters of brush strokes of the top (most visible)...
The Radon transform is a fundamental tool in many areas, for example in reconstruction of an image from its projections (CT scanning). Although it is situated at the core of many modern physical computations, the Radon transform lacks a coherent discrete definition for 2D discrete images which is algebraically exact, invertible, and rap...
Computing the Fourier transform of a function in polar coordinates is an important building block in many scientific disciplines and numerical schemes. In this paper we present the pseudo-polar Fourier transform that samples the Fourier transform on the pseudo-polar grid, also known as the concentric squares grid. The pseudo-polar grid con...
The sparsity which is implicit in MR images is exploited to significantly undersample k-space. Some MR images such as angiograms are already sparse in the pixel representation; other, more complicated images have a sparse representation in some transform domain-for example, in terms of spatial finite-differences or their wavelet coefficients. Accor...
Iterative thresholding algorithms have a long history of application to signal processing. Although they are intuitive and easy to implement, their development was heuristic and mainly ad hoc. Using a special form of the thresholding operation, called soft thresholding, we show that the fixed point of iterative thresholding is equivalent to minimum...
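The soft-thresholding iteration the abstract refers to can be sketched as follows. The step size and iteration count are illustrative; with a step of $1/L$, where $L$ is the largest eigenvalue of $A^T A$, the iteration converges to the $\ell_1$-penalized least-squares minimizer.

```python
import numpy as np

def soft(x, t):
    """Soft thresholding: the proximal map of the l1 norm."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def ista(A, y, lam, n_iter=500):
    """Iterative soft thresholding for min_x ||y - A x||^2 / 2 + lam ||x||_1."""
    L = np.linalg.norm(A, 2) ** 2          # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        x = soft(x + A.T @ (y - A @ x) / L, lam / L)
    return x

# Usage: with A = I the minimizer is soft(y, lam), reached in one step.
xhat = ista(np.eye(3), np.array([3.0, -0.5, 1.0]), lam=1.0)
```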
We develop a wavelet transform on the sphere, based on the spherical HEALPix coordinate system (Hierarchical Equal Area iso-Latitude Pixelization). HEALPix is heavily used for astronomical data processing applications; it is intrinsically multiscale and locally Euclidean, hence appealing for building multiscale systems. Furthermore, the equal-area p...
Consider a $d \times n$ matrix $A$, with $d < n$. The problem of solving for $x$ in $y = Ax$ is underdetermined, and has infinitely many solutions (if there are any). Given $y$, the minimum Kolmogorov complexity solution (MKCS) of the input $x$ is defined to be an input $z$ (out of many) with minimum Kolmogorov complexity that satisfies $y = Az$. One expects that if the actual i...
We applied the Virtual Northern technique to human brain mRNA to systematically measure human mRNA transcript lengths on a genome-wide scale.
We used separation by gel electrophoresis followed by hybridization to cDNA microarrays to measure 8,774 mRNA transcript lengths representing at least 6,238 genes at high (>90%) confidence. By comparing these...
Many statistical methods have been proposed in recent years for analysing the spatial distribution of galaxies. Very few of them, however, can properly handle the border effects of complex observational sample volumes. In this paper, we first show how to calculate the Minkowski Functionals (MFs) taking into account these border effects. We then p...
Image processing researchers commonly assert that "median filtering is better than linear filtering for removing noise in the presence of edges." Using a straightforward large-$n$ decision-theory framework, this folk-theorem is seen to be false in general. We show that median filtering and linear filtering have similar asymptotic worst-case mean-sq...
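The folk intuition at issue can be seen in a toy comparison: on a clean step edge, a running median reproduces the edge exactly while a moving average blurs it. This is a minimal illustration of the intuition the paper interrogates, not its decision-theoretic worst-case analysis.

```python
import numpy as np

def median_filter(x, k=3):
    """Running median with window size k (boundaries handled by edge padding)."""
    pad = k // 2
    xp = np.pad(x, pad, mode="edge")
    return np.array([np.median(xp[i:i + k]) for i in range(len(x))])

def mean_filter(x, k=3):
    """Moving average with window size k (boundaries handled by edge padding)."""
    pad = k // 2
    xp = np.pad(x, pad, mode="edge")
    return np.array([np.mean(xp[i:i + k]) for i in range(len(x))])

# A clean step edge: the median filter passes it through unchanged,
# while the moving average smears the transition.
step = np.concatenate([np.zeros(8), np.ones(8)])
```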
This paper describes two digital implementations of a new mathematical transform, namely, the second generation curvelet transform in two and three dimensions. The first digital transformation is based on unequally spaced fast Fourier transforms, while the second is based on the wrapping of specially selected Fourier samples. The two implementation...
Let $Q = Q_N$ denote either the $N$-dimensional cross-polytope $C^N$ or the $(N-1)$-dimensional simplex $T^{N-1}$. Let $A = A_{n,N}$ denote a random orthogonal projector $A: \mathbb{R}^N \to \mathbb{R}^n$. We compare the number of faces $f_k(AQ)$ of the projected polytope $AQ$ to the number of...
We consider inexact linear equations $y \approx \Phi x$ where $y$ is a given vector in $\mathbb{R}^n$, $\Phi$ is a given $n \times m$ matrix, and we wish to find $x_{0,\epsilon}$ as sparse as possible while obeying $\|y - \Phi x_{0,\epsilon}\|_2 \le \epsilon$. In general, this requires combinatorial optimization and so is considered intractable. On the other hand, the $\ell_1$-minimization problem is convex and is consi...
Many applications in signal processing lead to the optimization problems $\min \|x\|_1$ subject to $y = Ax$, and $\min \|x\|_1$ subject to $\|y - Ax\| \le \epsilon$, where $A$ is a given $d \times n$ matrix, $d < n$, and $y$ is a given $d \times 1$ vector. In this work we consider $\ell_1$ minimization by using LARS, Lasso, and homotopy method...
We are given a set of $n$ points that might be uniformly distributed in the unit square $[0,1]^2$. We wish to test whether the set, although mostly consisting of uniformly scattered points, also contains a small fraction of points sampled from some (a priori unknown) curve with $C^{\alpha}$-norm bounded by $\beta$. An asymptotic detection threshold...
We consider linear equations $y = \Phi x$ where $y$ is a given vector in $\mathbb{R}^n$ and $\Phi$ is a given $n \times m$ matrix with $n < m \le \tau n$, and we wish to solve for $x \in \mathbb{R}^m$. We suppose that the columns of $\Phi$ are normalized to the unit 2-norm, and we place uniform measure on such $\Phi$. We prove the existence of $\rho = \rho(\tau) > 0$ so that for large $n$ and for all $\Phi$'s except a negligible...
Let $A$ be a $d$ by $n$ matrix, $d < n$. Let $C$ be the regular cross-polytope (octahedron) in $\mathbb{R}^n$. It has recently been shown that properties of the centrosymmetric polytope $P = AC$ are of interest for finding sparse solutions to the underdetermined system of equations $y = Ax$ [9]. In particular, it is valuable to know that $P$ is centrally $k$-neighborly. We stud...