## About

- Publications: 356
- Reads: 47,548


- Citations: 40,356 (since 2016)

## Publications


Cluster analysis by nonnegative low-rank approximations has made remarkable progress in the past decade. However, the majority of such approximation approaches are still restricted to nonnegative matrix factorization (NMF) and suffer from the following two drawbacks: 1) they are unable to produce balanced partitions for large-scale manifold...

In this work, we consider the Bayesian optimization (BO) approach for parametric tuning of complex chaotic systems. Such problems arise, for instance, in tuning the sub-grid-scale parameterizations in weather and climate models. For such problems, the tuning procedure is generally based on a performance metric which measures how well the tuned mode...

Data visualization is one of the major applications of nonlinear dimensionality reduction. From the information retrieval perspective, the quality of a visualization can be evaluated by considering the extent to which the neighborhood relation of each data point is maintained while the number of unrelated points that are retrieved is minimized. This pr...
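The retrieval-based evaluation described above can be sketched as a neighborhood precision/recall computation. This is a minimal illustration with hypothetical helper names, not the paper's exact measure: for each point, compare its k nearest neighbors in the original space and in the visualization.

```python
import numpy as np

def knn_indices(X, k):
    """Indices of the k nearest neighbors of each row (excluding the point itself)."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)
    return np.argsort(d, axis=1)[:, :k]

def neighborhood_precision_recall(X_high, X_low, k=5):
    """Mean precision/recall of k-NN retrieval from the embedding vs. the original space."""
    nn_h = knn_indices(X_high, k)   # "relevant" neighbors (original space)
    nn_l = knn_indices(X_low, k)    # "retrieved" neighbors (visualization)
    prec = rec = 0.0
    for a, b in zip(nn_h, nn_l):
        overlap = len(set(a) & set(b))
        prec += overlap / k         # retrieved neighbors that are relevant
        rec += overlap / k          # relevant neighbors that are retrieved
    n = len(X_high)
    return prec / n, rec / n
```

With equal neighborhood sizes in both spaces, precision and recall coincide; asymmetric variants use different k on the two sides.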

Affective classification and retrieval of multimedia such as audio, image, and video have become emerging research areas in recent years. The previous research focused on designing features and developing feature extraction methods. Generally, a multimedia content can be represented with different feature representations (i.e., views). However, the...

With an ever-growing number of published scientific studies, there is a need for automated search methods able to collect and extract as much information as possible from those articles. We propose a framework for the extraction and characterization of brain activity areas published in neuroscientific reports, as well as a suitable clustering stra...

In this work, we consider the Bayesian optimization (BO) approach for tuning parameters of complex chaotic systems. Such problems arise, for instance, in tuning the sub-grid scale parameterizations in weather and climate models. For such problems, the tuning procedure is generally based on a performance metric which measures how well the tuned mode...

Information divergence that measures the difference between two nonnegative matrices or tensors has found its use in a variety of machine learning problems. Examples are Nonnegative Matrix/Tensor Factorization, Stochastic Neighbor Embedding, topic models, and Bayesian network optimization. The success of such a learning task depends heavily on a su...

Many modern clustering methods employ a non-convex objective function and use iterative optimization algorithms to find local minima. Thus, initialization of the algorithms is very important. Conventionally, the starting guess of the iterations is randomly chosen; however, such a simple initialization often leads to poor clusterings. Here we propose...
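The paper's own initialization scheme is truncated above; as a generic illustration of a structured (non-random) starting guess for an iterative clustering algorithm, the well-known k-means++ seeding can be sketched:

```python
import numpy as np

def kmeanspp_init(X, k, rng=None):
    """k-means++ seeding: spread initial centroids out by distance-weighted sampling."""
    rng = np.random.default_rng(rng)
    n = len(X)
    centers = [X[rng.integers(n)]]            # first center uniformly at random
    for _ in range(k - 1):
        # squared distance of every point to its closest already-chosen center
        d2 = np.min([np.sum((X - c) ** 2, axis=1) for c in centers], axis=0)
        probs = d2 / d2.sum()                 # favor points far from chosen centers
        centers.append(X[rng.choice(n, p=probs)])
    return np.array(centers)
```

Starting the iterations from such well-separated seeds typically avoids the degenerate local minima that a uniformly random guess can produce.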

Images usually convey information that can influence people’s emotional states. Such affective information can be used by search engines and social networks for better understanding the user’s preferences. We propose here a novel Bayesian multiple kernel learning method for predicting the emotions evoked by images. The proposed method can make use...

Emotional semantic image retrieval systems aim at incorporating the user’s affective states for responding adequately to the user’s interests. One challenge is to select features specific to image affect detection. Another challenge is to build effective learning models or classifiers to bridge the so-called “affective gap”. In this work, we study...

Stochastic matrices are arrays whose elements are discrete probabilities. They are widely used in techniques such as Markov Chains, probabilistic latent semantic analysis, etc. In such learning problems, the learned matrices, being stochastic matrices, are non-negative and all or part of the elements sum up to one. Conventional multiplicative updat...

Projective Nonnegative Matrix Factorization (PNMF) is able to extract sparse features and provide good approximation for discrete problems such as clustering. However, the original PNMF optimization algorithm cannot guarantee theoretical convergence during the iterative learning. We propose here an adaptive multiplicative algorithm for PNMF which...

Projective Nonnegative Matrix Factorization (PNMF) is one of the recent methods for computing low-rank approximations to data matrices. It is advantageous in many practical application domains such as clustering, graph partitioning, and sparse feature extraction. However, up to now a scalable implementation of PNMF for large-scale machine learning...

Nonnegative Matrix Factorization (NMF) based on the family of β-divergences has shown to be advantageous in several signal processing and data analysis tasks. However, how to automatically select the best divergence among the family for given data remains unknown. Here we propose a new estimation criterion to resolve the problem of selecting β. Our...
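The β-divergence family referred to above can be written down directly. A minimal sketch, assuming the standard parameterization in which β = 2, 1, 0 recover the half squared Euclidean distance, the generalized KL divergence, and the Itakura-Saito divergence:

```python
import numpy as np

def beta_divergence(x, y, beta):
    """Sum of elementwise beta-divergences d_beta(x || y).

    beta = 2: half squared Euclidean distance; beta = 1: generalized KL;
    beta = 0: Itakura-Saito. Other beta use the general formula."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    if beta == 1:
        return np.sum(x * np.log(x / y) - x + y)
    if beta == 0:
        return np.sum(x / y - np.log(x / y) - 1)
    return np.sum((x ** beta + (beta - 1) * y ** beta
                   - beta * x * y ** (beta - 1)) / (beta * (beta - 1)))
```

The selection problem the abstract addresses is precisely that each β induces a different noise assumption, so the best member of the family is data dependent.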

In the past decade, Probabilistic Latent Semantic Indexing (PLSI) has become an important modeling technique, widely used in clustering or graph partitioning analysis. However, the original PLSI is designed for multinomial data and may not handle other data types. To overcome this restriction, we generalize PLSI to t-exponential family based on a r...

Independent Subspace Analysis (ISA) consists in separating sets (subspaces) of dependent sources, with different sets being independent of each other. While a few algorithms have been proposed to solve this problem, they are all completely general in the sense that they do not make any assumptions on the intra-subspace dependency. In this paper, we...

Independent component analysis (ICA) is possibly the most widespread approach to solve the blind source separation problem. Many different algorithms have been proposed, together with several highly successful applications. There is also an extensive body of work on the theoretical foundations and limits of the ICA methodology. One practical concern...

Clustering analysis by nonnegative low-rank approximations has achieved remarkable progress in the past decade. However, most approximation approaches in this direction are still restricted to matrix factorization. We propose a new low-rank learning method to improve the clustering performance, which is beyond matrix factorization. The approximatio...

In Nonnegative Matrix Factorization (NMF), a nonnegative matrix is approximated by a product of lower-rank factorizing matrices. Most NMF methods assume that each factorizing matrix appears only once in the approximation, thus the approximation is linear in the factorizing matrices. We present a new class of approximative NMF methods, called Quadra...
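For reference, the linear baseline the abstract contrasts against can be sketched with the classical Lee-Seung multiplicative updates for the least-squares objective. This is a minimal sketch of standard NMF, not the paper's quadratic variant:

```python
import numpy as np

def nmf_multiplicative(V, rank, iters=300, rng=0, eps=1e-9):
    """Lee-Seung multiplicative updates minimizing ||V - W H||_F^2.

    Each update multiplies the factors elementwise by a nonnegative ratio,
    so nonnegativity is preserved automatically at every iteration."""
    rng = np.random.default_rng(rng)
    n, m = V.shape
    W = rng.random((n, rank)) + eps
    H = rng.random((rank, m)) + eps
    for _ in range(iters):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H
```

Note that V enters the approximation linearly in each factor; the quadratic methods of the abstract allow a factorizing matrix to appear twice.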

Many dynamical models, such as numerical weather prediction and climate models, contain so-called closure parameters. These parameters usually appear in physical parameterizations of sub-grid scale processes, and they act as "tuning handles" of the models. Currently, the values of these parameters are specified mostly manually, but the increasing...

Nonnegative Matrix Factorization (NMF) is a promising relaxation technique for clustering analysis. However, conventional NMF methods that directly approximate the pairwise similarities using the least square error often yield mediocre performance for data in curved manifolds because they can capture only the immediate similarities between data sam...

Multiplicative updates have been widely used in approximative nonnegative matrix factorization (NMF) optimization because they are convenient to deploy. Their convergence proof is usually based on the minimization of an auxiliary upper-bounding function, the construction of which however remains specific and only available for limited types of diss...

In document clustering, semantically similar documents are grouped together. The dimensionality of document collections is often very large, thousands or tens of thousands of terms. Thus, it is common to reduce the original dimensionality before clustering for computational reasons. Cosine distance is widely seen as the best choice for measuring th...
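The cosine distance mentioned above can be sketched in a few lines (an illustration; `cosine_distance` is a hypothetical helper name):

```python
import numpy as np

def cosine_distance(a, b):
    """1 - cos(angle between a and b): compares term direction, not document length."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    return 1.0 - (a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))
```

Because only the angle matters, duplicating a document (scaling its term vector) leaves its distance to other documents unchanged, which is why cosine distance suits term-frequency vectors of very different lengths.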

Explicit relevance feedback requires the user to explicitly refine the search queries for content-based image retrieval. This may become laborious or even impossible due to the ever-increasing volume of digital databases. We present a multimodal information collector that can unobtrusively record and asynchronously transmit the user’s implicit rele...

The I-divergence or unnormalized generalization of Kullback-Leibler (KL) divergence is commonly used in Nonnegative Matrix Factorization (NMF). This divergence has the drawback that its gradients with respect to the factorizing matrices depend heavily on the scales of the matrices, and learning the scales in gradient-descent optimization may requir...
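A minimal sketch of the I-divergence and a quick check of the scale sensitivity described above (an illustration; the function name is hypothetical):

```python
import numpy as np

def i_divergence(X, Y):
    """Unnormalized (generalized) KL divergence: sum of x*log(x/y) - x + y."""
    X, Y = np.asarray(X, float), np.asarray(Y, float)
    return np.sum(X * np.log(X / Y) - X + Y)
```

Unlike the normalized KL divergence of probability distributions, D_I(X || cX) = sum(X) * (c - 1 - log c) > 0 for any c != 1, so the objective (and its gradients) change with the overall scale of the reconstruction.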

What is Blind and Semi-blind Source Separation? Blind source separation (BSS) is a class of computational data analysis techniques for revealing hidden factors, that underlie sets of measurements or signals. BSS assumes a statistical model whereby the

The well-known Nonnegative Matrix Factorization (NMF) method can be provided with more flexibility by generalizing the non-normalized Kullback-Leibler divergence to α-divergences. However, the resulting α-NMF method can only achieve mediocre sparsity for the factorizing matrices. We have earlier proposed a variant of NMF, called Projective NMF (PN...

Uncertainties in future climate projections are often evaluated based on the perceived spread of ensembles of multi-model climate projections, such as those generated in different phases of the Coupled Model Intercomparison Project. In this paper we concentrate on uncertainties of a single climate model and the propagation of these uncertainties i...

Climate models contain closure parameters to which the model climate is sensitive. These parameters appear in physical parameterization schemes where some unresolved variables are expressed by predefined parameters rather than being explicitly modeled. Currently, best expert knowledge is used to define the optimal closure parameter values, based on...

Projective Nonnegative Matrix Factorization (PNMF) has demonstrated advantages in both sparse feature extraction and clustering. However, PNMF requires users to specify the column rank of the approximative projection matrix, the value of which is unknown beforehand. In this paper, we propose a method called ARDPNMF to automatically determine the co...

Nonnegativity has been shown to be a powerful principle in linear matrix decompositions, leading to sparse component matrices in feature analysis and data compression. The classical method is Lee and Seung's Nonnegative Matrix Factorization. A standard way to form learning rules is by multiplicative updates, maintaining nonnegativity. Here, a gener...

We introduce a probabilistic version of the self-organizing map (SOM) where we model the uncertainty of both the model vectors and the data. While uncertainty information about the data is often not available, this property becomes very useful when the method is combined in a hierarchical manner with probabilistic principal component analysis (PCA)...

It has been demonstrated that Student t-Distributed Stochastic Neighbor Embedding (t-SNE) can enhance discovery of clusters of data. However, the original t-SNE implementation employs an additive gradient-based algorithm which requires suitable learning step size and momentum rate, the tuning of which can be laborious. We propose a novel fixed-poin...

Climate models contain closure parameters to which the model climate is sensitive. These parameters appear in physical parameterization schemes where some unresolved variables are expressed by predefined parameters rather than being explicitly modeled. Currently, best expert knowledge is used to define the optimal closure parameter values, based on...

A variant of nonnegative matrix factorization (NMF) which was proposed earlier is analyzed here. It is called projective nonnegative matrix factorization (PNMF). The new method approximately factorizes a projection matrix, minimizing the reconstruction error, into a positive low-rank matrix and its transpose. The dissimilarity between the original...

This paper presents an approach that allows for performing regression on large data sets in reasonable time. The main component of the approach consists in speeding up the slowest operation of the used algorithm by running it on the Graphics Processing Unit (GPU) of the video card, instead of the processor (CPU). The experiments show a speedup of...

A new matrix factorization algorithm which combines two recently proposed nonnegative learning techniques is presented. Our new algorithm, α-PNMF, inherits the advantages of Projective Nonnegative Matrix Factorization (PNMF) for learning a highly orthogonal factor matrix. When the Kullback-Leibler (KL) divergence is generalized to α-divergence, it...

In time series prediction, one often does not know the properties of the underlying system generating the time series. For example, is it a closed system that is generating the time series or are there any external factors influencing the system? As a result, one often does not know beforehand whether a time series is stationary or nonstation...

Stochastic Neighbor Embedding (SNE) has shown to be quite promising for data visualization. Currently, the most popular implementation, t-SNE, is restricted to a particular Student t-distribution as its embedding distribution. Moreover, it uses a gradient descent algorithm that may require users to tune parameters such as the learning step size, mo...

The derivation of the Cramer-Rao bound (CRB) in ["Performance Analysis of the FastICA Algorithm and Cramer-Rao Bounds for Linear Independent Component Analysis," IEEE Trans. Signal Process., vol. 54, no. 4, Apr. 2006, pp. 1189-1203] contains errors, which influence the matrix form of the CRB but not the CRB on variance of relevant off-diago...

We give a general overview of the use and possible misuse of blind source separation (BSS) and independent component analysis (ICA) in the context of neuroinformatics data processing. A clear emphasis is given to the analysis of electrophysiological recordings, as well as to functional magnetic resonance images (fMRI). Two illustrative examples inc...

This paper presents the CATS Benchmark and the results of the competition organised during the IJCNN'04 conference in Budapest. Twenty-four papers and predictions have been submitted and seventeen have been selected. The goal of the competition was the prediction of 100 missing values divided into five groups of twenty consecutive values.

The Publisher regrets that this article is an accidental duplication of an article that has already been published in Neurocomputing, 70(2007), 2325–2329, doi:10.1016/j.neucom.2007.02.013. The duplicate article has therefore been withdrawn.

We propose here new variants of the Non-negative Matrix Factorization (NMF) method for learning spatially localized, sparse, part-based subspace representations of visual or other patterns. The algorithms are based on positively constrained projections and are related both to NMF and to the conventional SVD or PCA decomposition. A crucial question...

Many linear ICA techniques are based on minimizing a nonlinear contrast function and many of them use a hyperbolic tangent (tanh) as their built-in nonlinearity. In this paper we propose two rational functions to replace the tanh and other popular functions that are tailored for separating supergaussian (long-tailed) sources. The advantage of the r...

The fast independent component analysis (FastICA) algorithm is one of the most popular methods to solve problems in ICA and blind source separation. It has been shown experimentally that it outperforms most of the commonly used ICA algorithms in convergence speed. A rigorous local convergence analysis has been presented only for the so-called one-u...

FastICA is one of the most popular algorithms for independent component analysis (ICA), demixing a set of statistically independent sources that have been mixed linearly. A key question is how accurate the method is for finite data samples. We propose an improved version of the FastICA algorithm which is asymptotically efficient, i.e., its accuracy...
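A minimal one-unit, deflation-based FastICA sketch with the common tanh nonlinearity. This illustrates the baseline algorithm the abstract starts from, not the asymptotically efficient variant the paper proposes; all names are illustrative:

```python
import numpy as np

def fastica_deflation(X, n_components, iters=200, rng=0):
    """One-unit FastICA (tanh contrast) with deflationary orthogonalization.

    X: (n_signals, n_samples) linear mixtures. Data are centered and
    whitened first, so the demixing vectors can be kept orthonormal."""
    rng = np.random.default_rng(rng)
    X = X - X.mean(axis=1, keepdims=True)
    d, E = np.linalg.eigh(np.cov(X))                 # whitening transform
    Xw = E @ np.diag(1.0 / np.sqrt(d)) @ E.T @ X
    W = np.zeros((n_components, X.shape[0]))
    for i in range(n_components):
        w = rng.normal(size=X.shape[0])
        w /= np.linalg.norm(w)
        for _ in range(iters):
            wx = w @ Xw
            g, gp = np.tanh(wx), 1.0 - np.tanh(wx) ** 2
            w_new = (Xw * g).mean(axis=1) - gp.mean() * w   # fixed-point step
            w_new -= W[:i].T @ (W[:i] @ w_new)              # deflate found components
            w_new /= np.linalg.norm(w_new)
            converged = abs(abs(w_new @ w) - 1.0) < 1e-10
            w = w_new
            if converged:
                break
        W[i] = w
    return W @ Xw    # estimated sources, up to sign and permutation
```

As with all ICA methods, the sources are recovered only up to sign, scale, and ordering.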

In this paper, we enhance and analyze the Evolving Tree (ETree) data analysis algorithm. The suggested improvements aim to make the system perform better while still maintaining the simple nature of the basic algorithm. We also examine the system's behavior with many different kinds of tests, measurements and visualizations. We compare the ETree's...

The FastICA or fixed-point algorithm is one of the most successful algorithms for linear independent component analysis (ICA) in terms of accuracy and computational complexity. Two versions of the algorithm are available in literature and software: a one-unit (deflation) algorithm and a symmetric algorithm. The main results of this paper are analyti...

We introduce a neural network for the analysis of local independent components of an input signal. The network is a modification of Kohonen's adaptive-subspace self-organizing map. The map units consist of weight matrices adapted to represent linear transformations which locally minimize statistical dependence among pattern vector components. Train...

A new and efficient version of the Hough Transform for curve detection, the Randomized Hough Transform (RHT), has been recently suggested. The RHT selects n pixels from an edge image by random sampling to solve n parameters of a curve and then accumulates only one cell in a parameter space. In this paper, the RHT is related to other recent developm...
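The RHT idea described above, for the simplest case of a single line (n = 2 points determine the two parameters), can be sketched as follows. This is a toy illustration with hypothetical names and a coarse accumulator resolution:

```python
import numpy as np
from collections import Counter

def rht_line(points, n_trials=2000, res=0.05, rng=0):
    """Randomized Hough Transform for one line: sample point pairs, solve the
    normal-form parameters (theta, rho), and accumulate a single quantized
    cell per sample instead of sweeping the whole parameter space."""
    rng = np.random.default_rng(rng)
    pts = np.asarray(points, float)
    acc = Counter()
    for _ in range(n_trials):
        i, j = rng.choice(len(pts), size=2, replace=False)
        dx, dy = pts[j] - pts[i]
        if dx == 0 and dy == 0:
            continue
        theta = (np.arctan2(dy, dx) + np.pi / 2) % np.pi   # line normal angle
        rho = float(pts[i, 0] * np.cos(theta) + pts[i, 1] * np.sin(theta))
        acc[(round(theta / res), round(rho / res))] += 1   # one vote per sample
    (t_cell, r_cell), _ = acc.most_common(1)[0]
    return t_cell * res, r_cell * res
```

Because each sample votes in exactly one cell, the accumulator stays sparse; a full curve detector would also verify peaks against the edge image and iterate.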

We present an example of exploratory data analysis of climate measurements using a recently developed denoising source separation (DSS) framework. We analyzed a combined dataset containing daily measurements of three variables: surface temperature, sea level pressure and precipitation around the globe, for a period of 56 years. Components exhibitin...

The FastICA algorithm is a popular procedure for independent component analysis and blind source separation. In this paper, we analyze the average convergence behavior of the single-unit FastICA algorithm with kurtosis contrast for general m-source noiseless mixtures. We prove that this algorithm causes the average inter-channel interference...

In this paper, we tested the efficiency of a two-step blind source separation (BSS) approach for the extraction of independent sources of α-activity from ongoing electroencephalograms (EEG). The method starts with a denoising source separation (DSS) of the recordings, and is followed by either an independent component analysis (ICA) or a temporal d...

We propose using Independent Component Analysis (ICA) as an advanced pre-processing tool for blind suppression of interfering jammer signals in direct sequence spread spectrum communication systems utilizing antenna arrays. The role of ICA is to provide a jammer-mitigated signal to the conventional detection. If the jammer signal is weak or absent,...

We present a method for exploratory data analysis of large spatiotemporal data sets such as global longtime climate measurements, extending our previous work on semiblind source separation of climate data. The method seeks fast changing components whose variances exhibit slow behavior with specific temporal structure. The algorithm is developed in...

Information in its every form is becoming more and more important in the world of today. Modern computer systems can store huge amounts of data, and new data are acquired at an ever-increasing rate. In a recent study [1] it was estimated that we collectively produced around 5 exabytes (5×10^18 bytes) of new information in the year 2...

This book includes the proceedings of the International Conference on Artificial Neural Networks (ICANN 2006) held on September 10-14, 2006 in Athens, Greece, with tutorials being presented on September 10, the main conference taking place during September 11-13 and accompanying workshops on perception, cognition and interaction held on September 1...

In many fields of science, engineering, medicine and economics, large or huge data sets are routinely collected. Processing and transforming such data to intelligible form for the human user is becoming one of the most urgent problems of the near future. Neural networks and related statistical machine learning methods have turned out to be promising so...

The fixed point algorithm, known as FastICA, is one of the most successful algorithms for independent component analysis in terms of accuracy and low computational complexity. This paper derives analytic closed form expressions that characterize separating ability of both one-unit and symmetric version of the algorithm in a local sense. Based on th...

In image compression and feature extraction, linear expansions are standardly used. It was recently pointed out by Lee and Seung that the positivity or non-negativity of a linear expansion is a very powerful constraint that seems to lead to sparse representations for the images. Their technique, called Non-negative Matrix Factorization (NMF),...

This paper derives a closed-form expression for the Cramer-Rao bound (CRB) on estimating the source signals in the linear independent component analysis problem, assuming that all independent components have finite variance. It is also shown that the fixed-point algorithm known as FastICA can approach the CRB (the estimate can be nearly efficient)...