Publications (29)
37.69 Total Impact
ABSTRACT: Recently, a family of online kernel learning algorithms, known as the kernel adaptive filtering (KAF) algorithms, has become an emerging area of research. The KAF algorithms are developed in reproducing kernel Hilbert spaces (RKHS), using the linear structure of this space to implement well-established linear adaptive algorithms and to obtain nonlinear filters in the original input space. These algorithms include the kernel least mean squares (KLMS), kernel affine projection algorithms (KAPA), kernel recursive least squares (KRLS), and extended kernel recursive least squares (EX-KRLS), among others. When the kernels are radial (such as the Gaussian kernel), they naturally build a growing RBF network, where the weights are directly related to the errors at each sample. The aim of this chapter is to give a brief introduction to kernel adaptive filters. In particular, our focus is on KLMS, the simplest KAF algorithm, which is easy to implement yet efficient. Several key aspects of the algorithm are discussed, such as self-regularization, sparsification, quantization, and mean-square convergence. Application examples are also presented, including in particular an adaptive neural decoder for spike trains.
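The growing-network structure described in this abstract can be made concrete in a few lines. Below is a minimal KLMS sketch (our own illustrative implementation, not the chapter's code; the step size and kernel width are arbitrary choices): each sample adds one Gaussian center whose weight is the step size times the prediction error at that sample.

```python
import numpy as np

def klms(X, d, step_size=0.5, kernel_width=1.0):
    """Kernel least-mean-square: online regression in an RKHS.

    Each incoming sample adds one Gaussian RBF center; the new center's
    weight is the step size times the current prediction error.
    (Illustrative sketch; parameter defaults are our own choices.)
    """
    centers, weights, errors = [], [], []
    for x, y in zip(X, d):
        if centers:
            # Prediction: weighted sum of Gaussian kernels at stored centers.
            C = np.array(centers)
            k = np.exp(-np.sum((C - x) ** 2, axis=1) / (2 * kernel_width ** 2))
            y_hat = np.dot(weights, k)
        else:
            y_hat = 0.0
        e = y - y_hat
        # Grow the RBF network: center at x, weight proportional to the error.
        centers.append(x)
        weights.append(step_size * e)
        errors.append(e)
    return np.array(centers), np.array(weights), np.array(errors)

# Toy usage: learn y = sin(x) online; errors should shrink over time.
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
d = np.sin(X[:, 0])
_, _, errors = klms(X, d)
print(abs(errors[:20]).mean(), abs(errors[-20:]).mean())
```

The filter length grows by one per sample, which is exactly the scaling problem that the sparsification and quantization techniques mentioned above address.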
Article: On a PCA-based lung motion model
ABSTRACT: Respiration-induced organ motion is one of the major uncertainties in lung cancer radiotherapy, and it is crucial to be able to accurately model the lung motion. Most work so far has focused on the motion of a single point (usually the tumor center of mass), and much less work has been done to model the motion of the entire lung. Inspired by the work of Zhang et al (2007 Med. Phys. 34 4772-81), we believe that the spatiotemporal relationship of the entire lung motion can be accurately modeled based on principal component analysis (PCA), and that a sparse subset of the entire lung, such as an implanted marker, can then be used to drive the motion of the entire lung (including the tumor). The goal of this work is twofold. First, we aim to understand the underlying reason why PCA is effective for modeling lung motion and to find the optimal number of PCA coefficients for accurate lung motion modeling. We address these problems both in a theoretical framework and in the context of real clinical data. Second, we propose a new method to derive the entire lung motion from a single internal marker based on the PCA model. The main results of this work are as follows. We derived an important property which reveals the implicit regularization imposed by the PCA model. We then studied the model using two mathematical respiratory phantoms and 11 clinical 4DCT scans of eight lung cancer patients. For the mathematical phantoms with cosine and even-power (2n) cosine motion, we proved that 2 and 2n PCA coefficients and eigenvectors, respectively, completely represent the lung motion. Moreover, for the cosine phantom, we derived the equivalence conditions between the PCA motion model and the physiological 5D lung motion model (Low et al 2005 Int. J. Radiat. Oncol. Biol. Phys. 63 921-9). For the clinical 4DCT data, we demonstrated the modeling power and generalization performance of the PCA model.
The average 3D modeling error using PCA was within 1 mm (0.7 ± 0.1 mm). When a single artificial internal marker was used to derive the lung motion, the average 3D error was found to be within 2 mm (1.8 ± 0.3 mm) through comprehensive statistical analysis. The optimal number of PCA coefficients needs to be determined on a patient-by-patient basis, and two PCA coefficients appear sufficient for accurate modeling of the lung motion for most patients. In conclusion, we have presented a thorough theoretical analysis and clinical validation of the PCA lung motion model. The feasibility of deriving the entire lung motion using a single marker has also been demonstrated on clinical data using a simulation approach.
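The marker-driven reconstruction idea can be illustrated on synthetic data: fit PCA to a set of displacement fields, then recover an entire field from a few tracked components by solving a small least-squares problem for the PCA coefficients. All data, sizes, and variable names below are synthetic stand-ins, not the paper's clinical setup or notation.

```python
import numpy as np

rng = np.random.default_rng(1)
n_voxels, n_phases, n_modes = 500, 10, 2

# Synthetic "ground truth": displacement fields spanned by two spatial modes,
# mimicking the low-dimensional structure PCA exploits in real lung motion.
basis_true = rng.normal(size=(n_voxels, n_modes))
coeffs_true = np.stack([np.cos(np.linspace(0, 2 * np.pi, n_phases)),
                        np.sin(np.linspace(0, 2 * np.pi, n_phases))], axis=1)
fields = coeffs_true @ basis_true.T            # (phases, voxels)

# PCA on the training fields: the top singular vectors are the motion modes.
mean = fields.mean(axis=0)
U, s, Vt = np.linalg.svd(fields - mean, full_matrices=False)
components = Vt[:n_modes]                      # (modes, voxels)

# A "marker" = a handful of voxels whose motion we can actually track.
marker = [10, 11, 12]
new_field = fields[3]                          # treat this phase as unseen
obs = new_field[marker]

# Solve for the PCA coefficients from the marker motion alone...
w, *_ = np.linalg.lstsq(components[:, marker].T, obs - mean[marker], rcond=None)
# ...and reconstruct the entire displacement field from them.
recon = mean + w @ components
err = np.abs(recon - new_field).max()
print(err)
```

Because the synthetic fields here are exactly rank 2, the reconstruction is exact up to floating point; on clinical data the residual corresponds to the millimeter-level errors reported above.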
ABSTRACT: Similarity is a key concept for quantifying temporal signals or static measurements. Similarity is difficult to define mathematically; however, one rarely dwells on this difficulty and naturally translates similarity into correlation. This is one more example of how engrained second-order moment descriptors of the probability density function are in scientific thinking. Successful engineering or pattern recognition solutions built on these methodologies rely heavily on the Gaussianity and linearity assumptions, for exactly the same reasons discussed in Chapter 3.
Conference Paper: Fixed-budget kernel recursive least-squares
ABSTRACT: We present a kernel-based recursive least-squares (KRLS) algorithm on a fixed memory budget, capable of recursively learning a nonlinear mapping and tracking changes over time. In order to deal with the growing support inherent to online kernel methods, the proposed method uses a combined strategy of growing and pruning the support. In contrast to a previous sliding-window based technique, the presented algorithm does not prune the oldest data point at every time instant but instead aims to prune the least significant data point. We also introduce a label update procedure to equip the algorithm with tracking capability. Simulations show that the proposed method obtains better performance than state-of-the-art kernel adaptive filtering techniques given similar memory requirements.
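The grow-and-prune strategy can be sketched with a toy budgeted kernel filter. Here "least significant" is crudely approximated by the smallest absolute weight, a deliberate simplification of the significance criterion described in the abstract; the budget, step size, and width are our own choices.

```python
import numpy as np

def budgeted_klms(X, d, budget=30, step_size=0.5, width=1.0):
    """Fixed-budget online kernel filter (illustrative, not the paper's KRLS).

    Grows one Gaussian center per sample; when the dictionary exceeds
    `budget`, discards the center with the smallest absolute weight as a
    crude proxy for the least significant data point.
    """
    centers, weights = [], []
    for x, y in zip(X, d):
        if centers:
            C = np.array(centers)
            k = np.exp(-np.sum((C - x) ** 2, axis=1) / (2 * width ** 2))
            e = y - np.dot(weights, k)
        else:
            e = y
        centers.append(x)
        weights.append(step_size * e)
        if len(centers) > budget:
            # Prune: memory stays fixed regardless of how long we run.
            j = int(np.argmin(np.abs(weights)))
            centers.pop(j)
            weights.pop(j)
    return centers, weights

rng = np.random.default_rng(2)
X = rng.uniform(-3, 3, size=(300, 1))
d = np.sin(X[:, 0])
centers, weights = budgeted_klms(X, d)
print(len(centers))
```

The key property, shared with the paper's algorithm, is that memory and per-sample cost are bounded by the budget rather than by the number of samples seen.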
Chapter: Background and Preview
ABSTRACT: Definition of Surprise; A Review of Gaussian Process Regression; Computing Surprise; Kernel Recursive Least Squares with Surprise Criterion; Kernel Least Mean Square with Surprise Criterion; Kernel Affine Projection Algorithms with Surprise Criterion; Computer Experiments; Conclusion; Endnotes
ABSTRACT: Singular Value Decomposition; Positive-Definite Matrix; Eigenvalue Decomposition; Schur Complement; Block Matrix Inverse; Matrix Inversion Lemma; Joint, Marginal, and Conditional Probability; Normal Distribution; Gradient Descent; Newton's Method
ABSTRACT: Least-Mean-Square Algorithm; Kernel Least-Mean-Square Algorithm; Kernel and Parameter Selection; Step-Size Parameter; Novelty Criterion; Self-Regularization Property of KLMS; Leaky Kernel Least-Mean-Square Algorithm; Normalized Kernel Least-Mean-Square Algorithm; Kernel ADALINE; Resource Allocating Networks; Computer Experiments; Conclusion; Endnotes
ABSTRACT: Extended Recursive Least Squares Algorithm; Exponentially Weighted Extended Recursive Least Squares Algorithm; Extended Kernel Recursive Least Squares Algorithm; EX-KRLS for Tracking Models; EX-KRLS with Finite Rank Assumption; Computer Experiments; Conclusion; Endnotes
ABSTRACT: Recursive Least-Squares Algorithm; Exponentially Weighted Recursive Least-Squares Algorithm; Kernel Recursive Least-Squares Algorithm; Approximate Linear Dependency; Exponentially Weighted Kernel Recursive Least-Squares Algorithm; Gaussian Processes for Linear Regression; Gaussian Processes for Nonlinear Regression; Bayesian Model Selection; Computer Experiments; Conclusion; Endnotes
ABSTRACT: Half-Title Page; Wiley Series Page; Title Page; Copyright Page; Dedication Page; Table of Contents; Preface; Acknowledgements; Abbreviations and Symbols; Notation
ABSTRACT: Online learning from a signal processing perspective. There is increased interest in kernel learning algorithms in neural networks and a growing need for nonlinear adaptive algorithms in advanced signal processing, communications, and controls. Kernel Adaptive Filtering is the first book to present a comprehensive, unifying introduction to online learning algorithms in reproducing kernel Hilbert spaces. Based on research being conducted in the Computational NeuroEngineering Laboratory at the University of Florida and in the Cognitive Systems Laboratory at McMaster University, Ontario, Canada, this unique resource elevates adaptive filtering theory to a new level, presenting a new design methodology for nonlinear adaptive filters. Covers the kernel least mean squares algorithm, kernel affine projection algorithms, the kernel recursive least squares algorithm, the theory of Gaussian process regression, and the extended kernel recursive least squares algorithm. Presents a powerful model-selection method called maximum marginal likelihood. Addresses the principal bottleneck of kernel adaptive filters: their growing structure. Features twelve computer-oriented experiments to reinforce the concepts, with MATLAB codes downloadable from the authors' Web site. Concludes each chapter with a summary of the state of the art and potential future directions for original research. Kernel Adaptive Filtering is ideal for engineers, computer scientists, and graduate students interested in nonlinear adaptive systems for online applications (applications where the data stream arrives one sample at a time and incremental optimal solutions are desirable). It is also a useful guide for those who look for nonlinear adaptive filtering methodologies to solve practical problems.
ABSTRACT: The previous chapter defined cross-correntropy for the case of a pair of scalar random variables, and presented applications in statistical inference. This chapter extends the definition of correntropy to the case of random (or stochastic) processes, which are index sets of random variables. In statistical signal processing the index set is time; we are interested in random variables that are a function of time, and the goal is to quantify their statistical dependencies (although the index set can also be defined over inputs or channels of multivariate random variables). The autocorrelation function, which measures the statistical dependency between random variables at two different times, is conventionally utilized for this goal. Hence, we generalize the definition of autocorrelation to an autocorrentropy function. The name correntropy was coined to reflect the fact that the function "looks like" correlation but the sum over the lags (or over dimensions of the multivariate random variable) is the information potential (i.e., the argument of Renyi's quadratic entropy). The definition of cross-correntropy for random variables carries over to time series with a minor but important change in the domain of the variables, which now are an index set of lags. When it is clear from the context, we simplify the terminology and refer to the different functions (autocorrentropy or cross-correntropy) simply as the correntropy function, but keep the word "function" to distinguish them from the quantities of Chapter 10.
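A sample estimator of the autocorrentropy function described above replaces the product in the autocorrelation estimator with a Gaussian kernel of the difference, so all even-order moments of the difference enter the statistic. The function name and parameters below are our own notation.

```python
import numpy as np

def autocorrentropy(x, max_lag, sigma=1.0):
    """Sample autocorrentropy V[m] = mean over n of G_sigma(x[n] - x[n-m]),
    with the Gaussian kernel G_sigma. Illustrative sketch of the
    definition in the text; notation is ours.
    """
    x = np.asarray(x, dtype=float)
    V = np.empty(max_lag + 1)
    for m in range(max_lag + 1):
        diff = x[m:] - x[: len(x) - m]      # pairs at lag m (all zeros at m=0)
        V[m] = np.mean(np.exp(-diff ** 2 / (2 * sigma ** 2)))
    return V

rng = np.random.default_rng(3)
x = rng.normal(size=2000)
V = autocorrentropy(x, max_lag=5, sigma=1.0)
print(V[0])  # lag 0: kernel of a zero difference, exactly 1.0
```

Unlike autocorrelation, V[m] is bounded in (0, 1] for the Gaussian kernel, and averaging it over lags yields the information potential mentioned in the text.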
ABSTRACT: This paper discusses an information theoretic approach to designing sparse kernel adaptive filters. To determine which data are useful to learn and to remove redundant ones, a subjective information measure called surprise is introduced. Surprise captures the amount of information a datum contains which is transferable to a learning system. Based on this concept, we propose a systematic sparsification scheme, which can drastically reduce the time and space complexity without harming the performance of kernel adaptive filters. Nonlinear regression, short-term chaotic time-series prediction, and long-term time-series forecasting examples are presented.
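One way to read the surprise measure is as the negative log-likelihood of a new sample under a Gaussian-process model fit to the past data: redundant samples are well predicted and score low, while informative or abnormal samples score high. The sketch below uses that reading; it is our simplification, not the paper's exact definition, and all names and defaults are ours.

```python
import numpy as np

def surprise(X_past, y_past, x_new, y_new, sigma=1.0, noise=0.1):
    """Surprise of (x_new, y_new) as its negative log predictive likelihood
    under a GP with a Gaussian kernel fit to the past data.
    Illustrative simplification of the surprise criterion.
    """
    X_past = np.atleast_2d(X_past)
    # Kernel matrix of the past inputs, with noise on the diagonal.
    K = np.exp(-((X_past[:, None, :] - X_past[None, :, :]) ** 2).sum(-1)
               / (2 * sigma ** 2)) + noise ** 2 * np.eye(len(X_past))
    k = np.exp(-((X_past - x_new) ** 2).sum(-1) / (2 * sigma ** 2))
    # GP predictive mean and variance at x_new.
    mean = k @ np.linalg.solve(K, y_past)
    var = 1.0 + noise ** 2 - k @ np.linalg.solve(K, k)
    # Large when the prediction is poor or uncertain, small for redundant data.
    return 0.5 * np.log(2 * np.pi * var) + (y_new - mean) ** 2 / (2 * var)

# Usage: a sample consistent with the learned pattern vs. an abnormal one.
X_past = np.linspace(-2, 2, 20)[:, None]
y_past = np.sin(X_past[:, 0])
x_new = np.array([0.5])
s_ok = surprise(X_past, y_past, x_new, np.sin(0.5))
s_bad = surprise(X_past, y_past, x_new, 5.0)
print(s_ok, s_bad)
```

A sparsification rule then learns only samples whose surprise exceeds a threshold, which is the mechanism by which the scheme discards redundant data.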
ABSTRACT: This paper presents a kernelized version of the extended recursive least squares (EX-KRLS) algorithm which implements, for the first time, a general linear state model in reproducing kernel Hilbert spaces (RKHS), or equivalently a general nonlinear state model in the input space. The centerpiece of this development is a reformulation of the well-known extended recursive least squares (EX-RLS) algorithm in RKHS which only requires inner product operations between input vectors, thus enabling the application of the kernel property (commonly known as the kernel trick). The first part of the paper presents a set of theorems that shows the generality of the approach. The EX-KRLS is preferable to 1) a standard kernel recursive least squares (KRLS) algorithm in applications that require tracking the state vector of general linear state-space models in the kernel space, or 2) an EX-RLS algorithm when the application requires nonlinear observation and state models. The second part of the paper tests the EX-KRLS on nonlinear Rayleigh multipath channel tracking and on a Lorenz system modeling problem. We show that the proposed algorithm is able to outperform the standard KRLS and EX-RLS in both simulations.
ABSTRACT: This paper demonstrates the effectiveness of a nonlinear extension of the matched filter for signal detection in certain kinds of non-Gaussian noise. The decision statistic is based on a new measure of similarity that can be considered an extension of the correlation statistic used in the matched filter. The optimality of the matched filter is predicated on second-order statistics and hence leaves room for improvement, especially when the assumption of Gaussianity is not applicable. The proposed method incorporates higher-order moments in the decision statistic and shows an improvement in the receiver operating characteristics (ROC) for non-Gaussian noise, in particular noise with impulsive distributions. The performance of the proposed method is demonstrated for detection in two widely used impulsive noise models, the alpha-stable model and the two-term Gaussian mixture model. Moreover, unlike other kernel-based approaches, and those using characteristic functions directly, this method is still computationally tractable and can easily be implemented in real time.
Article: The correntropy MACE filter
ABSTRACT: The minimum average correlation energy (MACE) filter is well known for object recognition. This paper proposes a nonlinear extension of the MACE filter using the recently introduced correntropy function. Correntropy is a positive-definite function that generalizes the concept of correlation by utilizing second- and higher-order moments of the signal statistics. Because of its positive-definite nature, correntropy induces a new reproducing kernel Hilbert space (RKHS). Taking advantage of the linear structure of the RKHS, it is possible to formulate the MACE filter equations in the RKHS induced by correntropy and to obtain an approximate solution. Due to the nonlinear relation between the feature space and the input space, the correntropy MACE (CMACE) can potentially improve upon the MACE performance while preserving the shift-invariant property (additional computation for all shifts will be required in the CMACE). To alleviate the computational complexity of the solution, this paper also presents a fast CMACE using the fast Gauss transform (FGT). We apply the CMACE filter to the MSTAR public release synthetic aperture radar (SAR) data set as well as the PIE database of human faces, and show that the proposed method exhibits better distortion tolerance and outperforms the linear MACE in both generalization and rejection abilities.
ABSTRACT: The linear least mean squares (LMS) algorithm has recently been extended to a reproducing kernel Hilbert space, resulting in an adaptive filter built from a weighted sum of kernel functions evaluated at each incoming data sample. With time, the size of the filter as well as the computation and memory requirements increase. In this paper, we propose a new, efficient methodology for constraining the growth of the radial basis function (RBF) network resulting from the kernel LMS algorithm without significant sacrifice in performance. The method involves sequential Gaussian elimination steps on the Gram matrix to test the linear dependency of the feature vector corresponding to each new input with respect to all previous feature vectors. This gives an efficient method of continuing the learning while restricting the number of kernel functions used.
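The Gram-matrix dependency test can be sketched with the closely related approximate-linear-dependency criterion: compute the squared distance from the new feature vector to the span of the stored ones, and admit the sample as a new center only if that distance exceeds a threshold. This is a stand-in for the paper's Gaussian-elimination formulation; the threshold and names are ours.

```python
import numpy as np

def gaussian_kernel(a, b, width=1.0):
    return np.exp(-np.sum((a - b) ** 2) / (2 * width ** 2))

def dependency_test(dictionary, x, K_inv, width=1.0, threshold=1e-2):
    """Squared distance from phi(x) to the span of the stored feature
    vectors: k(x,x) - k_vec^T K^-1 k_vec. Small values mean phi(x) is
    nearly a linear combination of existing centers (redundant).
    Illustrative ALD-style check, not the paper's elimination procedure.
    """
    k = np.array([gaussian_kernel(c, x, width) for c in dictionary])
    delta = gaussian_kernel(x, x, width) - k @ K_inv @ k
    return delta, delta > threshold

# Usage: a duplicate of a stored center is (numerically) fully dependent,
# while a far-away input is nearly orthogonal to the stored features.
dictionary = [np.array([0.0]), np.array([1.0])]
K = np.array([[gaussian_kernel(a, b) for b in dictionary] for a in dictionary])
K_inv = np.linalg.inv(K)
d_dup, keep_dup = dependency_test(dictionary, np.array([0.0]), K_inv)
d_far, keep_far = dependency_test(dictionary, np.array([5.0]), K_inv)
print(d_dup, d_far)
```

Only samples passing the test enlarge the network, which bounds the number of kernel functions while learning continues on every sample.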

Conference Paper: The Well-posedness Analysis of the Kernel Adaline
ABSTRACT: In this paper, we investigate the well-posedness of the kernel adaline. The kernel adaline finds the linear coefficients in a radial basis function network using deterministic gradient descent. We show that gradient descent provides an inherent regularization as long as the training is properly early-stopped. Along with other popular regularization techniques, this result is investigated under a unifying regularization-function concept. This understanding provides an alternative and possibly simpler way to obtain regularized solutions compared with the cross-validation approach in regularization networks.
Publication Stats
704 Citations
37.69 Total Impact Points
Institutions

2006-2010

University of Florida
 • Department of Biomedical Engineering
 • Department of Electrical and Computer Engineering
Gainesville, FL, United States


2009

Amazon.com
Seattle, Washington, United States
