Amit Bermanis's research while affiliated with University of Toronto and other places

Publications (14)

Article
Dimensionality reduction methods are designed to overcome the ‘curse of dimensionality’ phenomenon that makes the analysis of high dimensional big data difficult. Many of these methods are based on principal component analysis, which is statistically driven and does not directly address the geometry of the data. Thus, machine learning tasks, such as c...
Article
High-dimensional big data appears in many research fields such as image recognition, biology and collaborative filtering. Often, classic algorithms encounter difficulties when exploring such data due to the ‘curse of dimensionality’ phenomenon. Therefore, dimensionality reduction methods are applied to the data prior to its analysis. M...
Article
Full-text available
We present a kernel based method that learns from small neighborhoods (patches) of multidimensional data points. This method is based on the spectral decomposition of a large structured kernel, accompanied by an out-of-sample extension method. In many cases, the performance of a spectral based learning mechanism is limited due to the use of a distance...
Article
The Diffusion Maps framework is a kernel based method for manifold learning and data analysis that defines diffusion similarities by imposing a Markovian process on the given dataset. Analysis of this process uncovers the intrinsic geometric structures in the data. Recently, it was suggested to replace the standard kernel by a measure-based kernel that...
Article
Dimensionality reduction methods are very common in the field of high dimensional data analysis. Typically, algorithms for dimensionality reduction are computationally expensive. Therefore, their application to the analysis of massive amounts of data is impractical. For example, repeated computations due to accumulated data are computationally p...
Article
Diffusion-based kernel methods are commonly used for analyzing massive high dimensional datasets. These methods utilize a non-parametric approach to represent the data by using an affinity kernel that represents similarities, distances or correlations between data points. The kernel is based on a Markovian diffusion process, whose transition probab...
Article
The particle filter is a powerful tool for state tracking using non-linear observations. We present a multiscale based method that accelerates the tracking computation by particle filters. Unlike the conventional approach, which calculates weights over all particles in each cycle of the algorithm, we sample a small subset from the source particles using mat...
Article
The diffusion maps framework is a kernel-based method for manifold learning and data analysis that models a Markovian process over data. Analysis of this process provides meaningful information concerning inner geometric structures in the data. Recently, it was suggested to replace the standard kernel by a measure-based kernel, which incorporates i...
Article
Diffusion Maps (DM) and other kernel methods are utilized for the analysis of high dimensional datasets. The DM method uses a Markovian diffusion process to model and analyze data. A spectral analysis of the DM kernel yields a map of the data into a low dimensional space, where Euclidean distances between the mapped data points represent the diff...
Article
A popular approach for analyzing high-dimensional datasets is to perform dimensionality reduction by applying non-parametric affinity kernels. Usually, it is assumed that the represented affinities are related to an underlying low-dimensional manifold from which the data is sampled. This approach works under the assumption that, due to the low-dime...
Article
We introduce a multiscale scheme for sampling scattered data and extending functions defined on the sampled data points, which overcomes some limitations of the Nyström interpolation method. The multiscale extension (MSE) method is based on mutual distances between data points. It uses a coarse-to-fine hierarchy of the multiscale decomposition of a...
Conference Paper
We present a method that accelerates the particle filter computation. The particle filter is a powerful method for tracking the state of a target based on non-linear observations. Unlike the conventional approach of calculating weights over all particles in each run, we sample a small subset of the particles using matrix decomposition methods, followed by a...
Conference Paper
The incorporation of matrix relations, which can encompass multidimensional similarities between local neighborhoods of points in the manifold, can improve kernel based data analysis. However, the utilization of multidimensional similarities results in a larger kernel, and hence the computational cost of the corresponding spectral decomposition incre...
Article
Full-text available
Symmetry detection and analysis in 3D images is a fundamental task in scientific fields such as computer vision, medical imaging and pattern recognition. In this work, we present a computational approach to 3D symmetry detection and analysis. Our analysis is conducted in the Fourier domain using the pseudo-polar Fourier tr...

Citations

... This method was introduced in the context of machine learning by Rabin and Coifman in [20], and is modeled after the classical Laplacian pyramids algorithm of Burt and Adelson [10], which is a standard technique in image processing. The LP extension algorithm has been considered in a variety of applications [14,11,24,1,18,12,2], and several variants have been proposed [15,21,22]. ...
... While several kernels are used in practice to construct the diffusion operator P, a standard choice is the Gaussian affinity k_ε(x, y) = exp(−‖x − y‖²/ε) [13,74,75,76], in which case we denote the diffusion operator P_ε, where ε determines the neighborhood radius. This kernel choice is often seen in theoretical and mathematical work due to its established properties on data sampled from locally low dimensional geometries (i.e., data manifolds) [74,77,78]. ...
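The Gaussian affinity and its Markov normalization into a diffusion operator can be sketched as follows (a minimal NumPy illustration of the standard construction, not any cited implementation; the function name, data, and ε value are assumptions):

```python
import numpy as np

def diffusion_operator(X, eps):
    """Build the row-stochastic diffusion operator P_eps from a Gaussian kernel."""
    # Pairwise squared Euclidean distances ||x - y||^2
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    K = np.exp(-sq_dists / eps)               # Gaussian affinities k_eps(x, y)
    return K / K.sum(axis=1, keepdims=True)   # Markov normalization: rows sum to 1

X = np.random.default_rng(0).normal(size=(50, 3))  # 50 sample points in R^3
P = diffusion_operator(X, eps=1.0)
```

Each row of P is then a transition probability distribution, so powers of P model the diffusion process, and ε controls how local the neighborhoods are.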
... The problem of multiple object tracking was first addressed by Reid (1979), where the current state is estimated from previous frames using a Kalman filter. Later, particle filtering (also known as sequential Monte Carlo) was introduced, where a set of weighted particles sampled from a proposal distribution is maintained to represent the current and hidden states (Okuma et al., 2004; Shabat et al., 2015). This allows handling nonlinear multimodal distributions. ...
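One predict–update–resample cycle of a bootstrap (SIR) particle filter for a 1-D state can be sketched as follows (an illustrative sketch only; the Gaussian motion and observation models, the ESS resampling threshold, and all parameter values are assumptions, not taken from the cited works):

```python
import numpy as np

rng = np.random.default_rng(1)

def particle_filter_step(particles, weights, observation, motion_std, obs_std):
    """One bootstrap particle filter cycle: predict, reweight, resample."""
    # Predict: propagate every particle through the motion model
    particles = particles + rng.normal(0.0, motion_std, size=particles.shape)
    # Update: reweight by the observation likelihood (Gaussian here)
    weights = weights * np.exp(-0.5 * ((observation - particles) / obs_std) ** 2)
    weights = weights / weights.sum()
    # Resample when the effective sample size degenerates
    if 1.0 / np.sum(weights ** 2) < len(particles) / 2:
        idx = rng.choice(len(particles), size=len(particles), p=weights)
        particles = particles[idx]
        weights = np.full(len(particles), 1.0 / len(particles))
    return particles, weights

particles = rng.normal(0.0, 1.0, size=500)       # prior samples
weights = np.full(500, 1.0 / 500)
for obs in (0.2, 0.4, 0.5):                      # toy observation sequence
    particles, weights = particle_filter_step(particles, weights, obs,
                                              motion_std=0.1, obs_std=0.3)
estimate = np.sum(weights * particles)           # weighted posterior mean
```

The multiscale acceleration discussed in the abstracts above replaces the full per-cycle reweighting of all particles with weights computed on a sampled subset and extended to the rest.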
... Many more variants of the basic kernelized graph Laplacian have been developed, including the use of anisotropic kernel [34,40,8], landmark sets [6,26,31], and so on. These methods improve the statistical and computational efficiency of graph Laplacian methods in various scenarios, and may be used concurrently with bi-stochastic normalization. ...
... The high-density regions suggested that perhaps not every inhibition condition needed to have been measured to capture the geometry of the drug-inhibition state space. Applying a previously published sampling technique to the PhEMD drug-screen embedding [28], we found that 34 landmark points could fully capture the EMT perturbation state space (Supplementary Fig. 7); the phenotypes of the remaining experimental conditions could be inferred in relation to these 34 (Supplementary Note 8). This finding highlighted a potential opportunity for reducing the cost of future single-cell drug-screen experiments, as it suggested that only a small fraction (11%) of all inhibitors may need to be experimentally measured using expensive single-cell profiling techniques to learn the full range of perturbation effects. ...
... This can lead to an approximation error, for example, when the rank of G is smaller than the rank of A. Among the uses of fast randomized matrix decomposition algorithms, we find applications for tracking objects in videos [30], multiscale data extensions [2] and detecting anomalies in network traffic for finding cyber attacks [8]. There are randomized versions of many different matrix factorizations [17], such as the singular value decomposition (SVD), the interpolative decomposition (ID) [7], the pseudo-skeleton decomposition [16] (whose randomized version is given by the CUR decomposition [12]), compressed sensing [11] and a randomized version for solving least squares problems [27]. ...
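The basic sketch-then-solve idea behind randomized SVD can be illustrated as follows (a minimal Halko–Martinsson–Tropp-style sketch, not the specific algorithms cited above; the oversampling amount and all names are assumptions):

```python
import numpy as np

def randomized_svd(A, k, oversample=10, seed=0):
    """Approximate rank-k SVD of A via a Gaussian random sketch."""
    rng = np.random.default_rng(seed)
    # Sketch the range of A with a random Gaussian test matrix
    Y = A @ rng.normal(size=(A.shape[1], k + oversample))
    Q, _ = np.linalg.qr(Y)                        # orthonormal basis for the sketch
    # Decompose the small projected matrix instead of A itself
    U_small, s, Vt = np.linalg.svd(Q.T @ A, full_matrices=False)
    return (Q @ U_small)[:, :k], s[:k], Vt[:k]

rng = np.random.default_rng(0)
A = rng.normal(size=(100, 8)) @ rng.normal(size=(8, 60))  # exactly rank 8
U, s, Vt = randomized_svd(A, k=8)
```

When the true rank of A is at most k, the oversampled sketch captures its range almost surely and the factorization is exact up to floating-point error; otherwise the truncation introduces the approximation error mentioned in the passage.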
... Although a DM provides an appealing analytic relation between the spectral embedding and diffusion coordinates [37–39], it often separates trajectories, pathways or clusters into independent eigenspaces. This, in turn, yields multidimensional representations that cannot be conveniently visualized (e.g., having substantially more than two or three dimensions) and cannot be directly projected into two- or three-dimensional displays that faithfully capture diffusion distances. ...
... Ideally, we would like to generalize the spectral embedding without such recomputation. Classical methods include Nyström extension [23,1,31] and its variants [4]. More recently, a neural network-based approach has been proposed in [25] to parametrize the eigenvectors of the Laplacian that automatically gives an out-of-sample extension. ...
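The classical Nyström formula extends an eigenvector φ_j of a kernel matrix to a new point x via ψ_j(x) = (1/λ_j) Σ_i k(x, x_i) φ_j(x_i). A minimal sketch, assuming a Gaussian kernel (names, data, and the choice of three leading eigenpairs are illustrative):

```python
import numpy as np

def nystrom_extend(X, X_new, eigvecs, eigvals, eps):
    """Extend kernel eigenvectors from the sample X to out-of-sample points X_new."""
    sq = np.sum((X_new[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    K_new = np.exp(-sq / eps)          # affinities between new and sampled points
    # psi_j(x) = sum_i k(x, x_i) * phi_j(x_i) / lambda_j
    return K_new @ eigvecs / eigvals

rng = np.random.default_rng(0)
X = rng.normal(size=(40, 2))
K = np.exp(-np.sum((X[:, None] - X[None, :]) ** 2, axis=-1) / 1.0)
eigvals, eigvecs = np.linalg.eigh(K)                   # ascending order
lead_vals, lead_vecs = eigvals[-3:], eigvecs[:, -3:]   # three leading eigenpairs
ext = nystrom_extend(X, X, lead_vecs, lead_vals, eps=1.0)
```

Evaluating the extension back at the sampled points reproduces the original eigenvectors, since K φ_j = λ_j φ_j; the multiscale extension work listed above addresses cases where this single-scale formula becomes ill-conditioned.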
... These aforementioned techniques provide good invariance theories, but they suffer from either computational complexity or accuracy problems [23]. Alternative methods such as the Pseudo-Polar Fast Fourier Transform (PPFFT) [24,25] and the Unequally Spaced Fast Fourier Transform (USFFT) [26] have been developed to increase accuracy and decrease the computational complexity. The pseudo-polar Fourier transform, based on the definition of a polar-like 2D grid, provides a faster Fourier computation [24]. ...