Article

Kernel Methods for Pattern Analysis

Authors: John Shawe-Taylor, Nello Cristianini
... Kernels (Shawe-Taylor and Cristianini, 2004) are a convenient approach to accommodate nonlinearity and to work with high-dimensional, complex features, such as parameters from a model of brain dynamics. In general, kernels are similarity functions, and they can be used straightforwardly in a prediction algorithm. ...
... While feature matrices can be very high-dimensional, a kernel is represented by a (no. of subjects by no. of subjects) matrix. Kernel methods can readily be adapted to deal with nonlinear decision boundaries in prediction by projecting the data into a high-dimensional (possibly infinite-dimensional) space through an embedding x → ϕ(x); then, by estimating a linear separating hyperplane in this space, we effectively obtain a nonlinear estimator on the original space (Shawe-Taylor and Cristianini, 2004). In practice, instead of working explicitly in the higher-dimensional embedding space, the so-called kernel trick uses a kernel function κ(n, m) containing the similarity between data points n and m (here, subjects) in the higher-dimensional embedding space (Schölkopf et al., 2002; Shawe-Taylor and Cristianini, 2004), which can be simpler to calculate. Once κ(·, ·) is computed for each pair of subjects, this is all that is needed for the prediction. ...
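The excerpt above describes the kernel trick in terms of a subjects-by-subjects similarity matrix κ(n, m). As a minimal sketch of that idea (a generic Gaussian kernel on made-up data with an arbitrary ridge parameter, not the HMM-Fisher kernel pipeline of the paper below), the following snippet builds the Gram matrix and uses it for kernel ridge prediction without ever forming the embedding ϕ(x) explicitly:

```python
import numpy as np

def rbf_kernel(A, B, sigma=1.0):
    """kappa(n, m) = exp(-||x_n - x_m||^2 / (2 sigma^2))."""
    sq_dists = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-sq_dists / (2 * sigma ** 2))

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 200))        # 50 subjects, 200 high-dimensional features
y = rng.normal(size=50)               # e.g. a cognitive trait to predict

K = rbf_kernel(X, X)                  # 50 x 50 subjects-by-subjects kernel matrix
alpha = np.linalg.solve(K + 0.1 * np.eye(50), y)   # dual (kernel) ridge weights

X_new = rng.normal(size=(5, 200))     # unseen subjects
y_hat = rbf_kernel(X_new, X) @ alpha  # prediction uses only kernel values
```

The prediction for a new subject depends only on its kernel values against the training subjects, which is exactly the point of the excerpt: once κ(·, ·) is computed, nothing else is needed.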
Article
Full-text available
Predicting an individual’s cognitive traits or clinical condition using brain signals is a central goal in modern neuroscience. This is commonly done using either structural aspects, such as structural connectivity or cortical thickness, or aggregated measures of brain activity that average over time. But these approaches are missing a central aspect of brain function: the unique ways in which an individual’s brain activity unfolds over time. One reason why these dynamic patterns are not usually considered is that they have to be described by complex, high-dimensional models; and it is unclear how best to use these models for prediction. We here propose an approach that describes dynamic functional connectivity and amplitude patterns using a Hidden Markov model (HMM) and combines it with the Fisher kernel, which can be used to predict individual traits. The Fisher kernel is constructed from the HMM in a mathematically principled manner, thereby preserving the structure of the underlying model. We show here, in fMRI data, that the HMM-Fisher kernel approach is accurate and reliable. We compare the Fisher kernel to other prediction methods, both time-varying and time-averaged functional connectivity-based models. Our approach leverages information about an individual’s time-varying amplitude and functional connectivity for prediction and has broad applications in cognitive neuroscience and personalised medicine.
... Intuitively, a kernel can be understood as a two-place function that measures the similarity between two objects. In a slightly more formal definition, we could say that a given real symmetric two-place function is a kernel iff, for every finite set of objects, it generates a real-valued matrix that is square, symmetric and positive semi-definite [Shawe-Taylor, 2004]. This kind of matrix is called a "kernel matrix" and it stores the pairwise comparisons performed by the kernel function. ...
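A quick way to see the kernel-matrix definition in the excerpt above is to test the three conditions directly. The sketch below is illustrative only; the tolerance and the toy data are assumptions:

```python
import numpy as np

def is_valid_kernel_matrix(K, tol=1e-10):
    """Square, symmetric and positive semi-definite -- the kernel-matrix conditions."""
    K = np.asarray(K, dtype=float)
    if K.ndim != 2 or K.shape[0] != K.shape[1]:
        return False                          # not square
    if not np.allclose(K, K.T, atol=tol):
        return False                          # not symmetric
    eigvals = np.linalg.eigvalsh(K)           # eigenvalues of the symmetric matrix
    return bool(eigvals.min() >= -tol)        # PSD up to numerical round-off

X = np.random.default_rng(1).normal(size=(10, 3))
print(is_valid_kernel_matrix(X @ X.T))        # linear-kernel Gram matrix -> True
print(is_valid_kernel_matrix(-(X @ X.T)))     # negated matrix fails the PSD check
```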
Preprint
Full-text available
A definitive cure for HIV/AIDS does not exist yet and, thus, patients rely on antiretroviral therapy for life. In this scenario, the emergence of drug resistance is an important concern. The automatic prediction of resistance from HIV sequences is a fast tool for physicians to choose the best possible medical treatment. This paper proposes three kernel functions to deal with this data: one focused on single residue mutations, another on k-mers (close-range information in sequence), and another on pairwise interactions between amino acids (close- and long-range information). Furthermore, the three kernels are able to deal with the categorical nature of HIV data and the presence of allelic mixtures. The experiments on the PI dataset from the Stanford Genotype-Phenotype database show that they generate prediction models with a very good performance, while remaining simple, open and interpretable. Most of the mutations and patterns they consider relevant are in agreement with previous literature. Also, this paper compares the different but complementary views that two kernel methods (SVM and kernel PCA) give over HIV data, showing that the former is focused on optimizing prediction while the latter summarizes the main patterns of genetic diversity, which in the Stanford Genotype-Phenotype database are related to drug resistance and HIV subtype.
... Multivariate adaptive regression splines introduce sparsity, but enrichment analysis performs better with a dense input. We can estimate the conditional expectations of Φ using any general non-linear regression method, so we instead estimated the expectations using kernel ridge regression equipped with a radial basis function kernel (Shawe-Taylor and Cristianini, 2004). We then computed the D-RCS across all patients for each variable in X . ...
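As a hedged illustration of the step described above, estimating conditional expectations with kernel ridge regression and a radial basis function kernel, the snippet below uses scikit-learn's KernelRidge on synthetic data; the hyperparameters alpha and gamma are placeholders rather than the values used by the authors:

```python
import numpy as np
from sklearn.kernel_ridge import KernelRidge

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(300, 1))                   # a generic predictor
phi = np.sin(X).ravel() + 0.1 * rng.normal(size=300)    # stand-in for one column of Phi

# Kernel ridge regression with an RBF kernel; alpha (ridge penalty) and
# gamma (RBF width) would normally be tuned, e.g. by cross-validation.
krr = KernelRidge(kernel="rbf", alpha=0.1, gamma=0.5).fit(X, phi)
phi_hat = krr.predict(X)                                # estimated E[Phi | X]
print(np.corrcoef(phi, phi_hat)[0, 1])
```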
Article
Full-text available
Root causal gene expression levels – or root causal genes for short – correspond to the initial changes to gene expression that generate patient symptoms as a downstream effect. Identifying root causal genes is critical towards developing treatments that modify disease near its onset, but no existing algorithms attempt to identify root causal genes from data. RNA-sequencing (RNA-seq) data introduces challenges such as measurement error, high dimensionality and non-linearity that compromise accurate estimation of root causal effects even with state-of-the-art approaches. We therefore instead leverage Perturb-seq, or high-throughput perturbations with single-cell RNA-seq readout, to learn the causal order between the genes. We then transfer the causal order to bulk RNA-seq and identify root causal genes specific to a given patient for the first time using a novel statistic. Experiments demonstrate large improvements in performance. Applications to macular degeneration and multiple sclerosis also reveal root causal genes that lie on known pathogenic pathways, delineate patient subgroups and implicate a newly defined omnigenic root causal model.
... Unlike traditional fuzzy rough sets, which rely on fixed neighborhood relationships, KFRS leverages kernel functions to dynamically adapt neighborhood boundaries based on feature correlations. Kernel methods are a class of machine learning algorithms and have been extensively applied across a wide range of problems [62][63][64][65]. This flexibility enables KFRS to address the inherent uncertainty and heterogeneity in multi-label datasets, particularly those with mixed numerical and categorical attributes. ...
Article
Full-text available
Multi-label learning, which involves assigning multiple class labels to each instance, becomes increasingly complex when dealing with large-scale mixed datasets featuring high-dimensional feature spaces. These mixed datasets often involve a combination of numerical and categorical features, which exacerbate the challenges of multi-label learning by introducing additional layers of uncertainty and variability. Traditional classification methods, although effective in simpler scenarios, often fail to address these complexities resulting in significant errors. To overcome this, we have developed an entropy-based objective function that captures the intricate interplay between features and classes, while accounting for the inherent uncertainty of mixed data. This objective function explicitly accounts for the heterogeneous nature of mixed datasets, ensuring robust feature selection across diverse attribute types. To tackle these challenges, we propose a memetic algorithm that integrates fuzzy rough sets with enhancements from kernel fuzzy rough sets (KFRS), and the Non-dominated Sorting Genetic Algorithm II. This synergy enables the extraction of optimal feature subsets that significantly improve classification performance. By leveraging kernel-based similarity measures, KFRS refines the partitions formed by fuzzy set memberships for distinct classes, ensuring precise alignment of data samples with multiple labels, while effectively handling the complexities of mixed-data representation. A key strength of our approach lies in its ability to preserve valuable information through KFRS-driven feature selection. Empirical evaluations on three benchmark datasets highlight the effectiveness of the proposed methodology. The results validate the superiority of our feature selection strategy, grounded in kernel-modulated neighborhoods; furthermore, the implementation demonstrates a notable improvement in both solution quality and search efficiency, establishing it as a highly promising method for multi-label learning tasks.
... M is the size of the labelled data set and B is a function of the specific kernel and the loss function from equation 4 [39]. Such approaches are often defined in terms of VC dimension or fat-shattering dimension [40]; however, these bounds all represent worst-case scenarios and have limited practical relevance. ...
Article
Full-text available
The popular qubit framework has dominated recent work on quantum kernel machine learning, with results characterising expressivity, learnability and generalisation. As yet, there is no comparative framework to understand these concepts for continuous variable (CV) quantum computing platforms. In this paper we represent CV quantum kernels as closed form functions and use this representation to provide several important theoretical insights. We derive a general closed form solution for all CV quantum kernels and show every such kernel can be expressed as the product of a Gaussian and an algebraic function of the parameters of the feature map. Furthermore, in the multi-mode case, we present quantification of a quantum-classical separation for all quantum kernels via a hierarchical notion of the “stellar rank" of the quantum kernel feature map. We then prove kernels defined by feature maps of infinite stellar rank, such as GKP-state encodings, can be approximated arbitrarily well by kernels defined by feature maps of finite stellar rank. Finally, we simulate learning with a single-mode displaced Fock state encoding and show that (i) accuracy on our specific task (an annular data set) increases with stellar rank, (ii) for underfit models, accuracy can be improved by increasing a bandwidth hyperparameter, and (iii) for noisy data that is overfit, decreasing the bandwidth will improve generalisation but does so at the cost of effective stellar rank.
... With smaller values of the parameter σ, the kernel matrix becomes closer to the identity matrix, at the risk of overfitting. On the other hand, larger values of the parameter gradually reduce the kernel to a constant function, making it impossible to learn any non-trivial classifier [28]. In our experiment, we use a separate validation data set consisting of 25% of the total samples to determine the parameters of the kernels, such as the value of σ in RBF kernels. ...
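The bandwidth behaviour described in the excerpt, and the 25% validation split used to pick σ, can be reproduced on toy data as follows. This is a sketch only: the candidate σ grid, the synthetic labels and the SVC classifier are assumptions, and scikit-learn parameterizes the RBF kernel by γ = 1/(2σ²):

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(400, 5))
y = (X[:, 0] * X[:, 1] > 0).astype(int)             # labels needing a non-linear boundary

# Hold out 25% of the samples as a validation set for choosing sigma.
X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.25, random_state=0)

for sigma in [0.01, 0.1, 1.0, 10.0, 100.0]:
    gamma = 1.0 / (2 * sigma ** 2)                  # sklearn's RBF is exp(-gamma * ||x - x'||^2)
    acc = SVC(kernel="rbf", gamma=gamma).fit(X_tr, y_tr).score(X_val, y_val)
    print(f"sigma={sigma:>6}: validation accuracy = {acc:.3f}")
# Very small sigma -> Gram matrix close to the identity (overfitting risk);
# very large sigma -> nearly constant kernel, no non-trivial classifier.
```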
Preprint
Medical research makes it possible to acquire diverse types of data from the same individual for a particular cancer. Recent studies show that utilizing such diverse data results in more accurate predictions. The major challenge is how to utilize such diverse data sets in an effective way. In this paper, we introduce a multiple kernel based pipeline for integrative analysis of high-throughput molecular data (somatic mutation, copy number alteration, DNA methylation and mRNA) and clinical data. We apply the pipeline to ovarian cancer data from TCGA. After a combined kernel has been generated from the weighted sum of individual kernels, it is used to stratify patients and predict clinical outcomes. We examine the survival time, vital status, and neoplasm cancer status of each subtype to verify how well they cluster. We have also examined the power of molecular and clinical data in predicting dichotomized overall survival and in classifying the tumor grade of the cancer samples. It was observed that the integration of various data types yields higher log-rank statistic values. We were also able to predict clinical status with higher accuracy as compared to using individual data types.
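A minimal sketch of the multiple-kernel idea in this abstract, combining per-data-type kernels by a weighted sum and using the result to stratify patients, might look like the following; the data, the weights and the choice of spectral clustering are illustrative assumptions, not the authors' pipeline:

```python
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel
from sklearn.cluster import SpectralClustering

rng = np.random.default_rng(0)
n = 100                                   # patients
views = {                                 # hypothetical per-patient data types
    "mutation":    rng.integers(0, 2, size=(n, 50)).astype(float),
    "methylation": rng.normal(size=(n, 200)),
    "mrna":        rng.normal(size=(n, 300)),
}
weights = {"mutation": 0.3, "methylation": 0.3, "mrna": 0.4}   # assumed kernel weights

# Weighted sum of per-view kernels; a non-negative combination of valid kernels
# is itself a valid kernel.
K = sum(w * rbf_kernel(views[name]) for name, w in weights.items())

# Use the combined kernel to stratify patients into subtypes.
subtypes = SpectralClustering(n_clusters=3, affinity="precomputed",
                              random_state=0).fit_predict(K)
print(np.bincount(subtypes))
```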
Article
Full-text available
The classical hinge-loss support vector machines (SVMs) model is sensitive to outlier observations due to the unboundedness of its loss function. To circumvent this issue, recent studies have focused on non-convex loss functions, such as the hard-margin loss, which associates a constant penalty to any misclassified or within-margin sample. Applying this loss function yields much-needed robustness for critical applications, but it also leads to an NP-hard model that makes training difficult: current exact optimization algorithms show limited scalability, whereas heuristics are not able to find high-quality solutions consistently. Against this background, we propose new integer programming strategies that significantly improve our ability to train the hard-margin SVM model to global optimality. We introduce an iterative sampling and decomposition approach, in which smaller subproblems are used to separate combinatorial Benders’ cuts. Those cuts, used within a branch-and-cut algorithm, permit our solution framework to converge much more quickly toward a global optimum. Through extensive numerical analyses on classical benchmark data sets, our solution algorithm solves, for the first time, 117 new data sets out of 873 to optimality and achieves a reduction of 50% in the average optimality gap for the hardest datasets of the benchmark.
Article
Full-text available
Statistical heterogeneity in Federated Learning (FL) often leads to client drift and biased local solutions. Prior work in the literature shows that client drift particularly affects the parameters of the classification layer, hindering both convergence and accuracy. While Personalized FL (PFL) addresses this by allowing client-specific models, it can overlook valuable global knowledge. This paper introduces Federated Recursive Ridge Regression (Fed3R), a fast and efficient method to construct a closed-form classifier that effectively incorporates global knowledge while being inherently robust to statistical heterogeneity. Fed3R leverages a pre-trained feature extractor and a recursive ridge regression formulation to achieve exact aggregation of local classifiers and recover the centralized solution. We demonstrate that Fed3R serves as a robust initialization for further fine-tuning with various FL and PFL algorithms, accelerating convergence and boosting performance. Furthermore, we propose Only Local Labels (OLL), a novel PFL technique that simplifies local classifiers by focusing only on locally relevant classes, preventing misclassifications and improving efficiency. Our empirical evaluation on real-world cross-device datasets shows that Fed3R, combined with OLL, significantly improves performance and reduces training costs in heterogeneous FL and PFL scenarios.
Article
A cutting tool plays a crucial role in the material removal process, and effective tool condition monitoring has gained significant attention in the industry. In-process development of any kind of tool fault leads to a reduction in machining accuracy, degradation of surface finish, and causes interruptions, to name a few. Such faults are untraceable using the conventional condition monitoring approach and need to be addressed smartly. In an attempt to characterize such unknown moments, a machine learning based framework is proposed herein. In order to generate data sets, change in spindle acceleration was acquired for various configurations focusing on failure modes of a tipped tool during the face milling process. The current investigation focuses on monitoring of defects such as wearing of the flank face and nose radius, crater and notch wear, and fracturing of the cutting edge. In the beginning, the distinction between the damaged and damage-free classes was estimated in terms of descriptive statistics, and the training dataset was established using 16 features. The logic of the decision tree (DT) assisted the selection of significant features. The Sequential Minimal Optimization (SMO) algorithm is then deployed for training the data through kernels of the support vector machine (SVM), and the classification model is constructed. Further, a robustness analysis is presented to examine the performance of the SMO-SVM model. Finally, tipped tool fault classification for test and blind datasets was carried out considering the proposed framework. The SMO-SVM classifier, using the ‘polynomial’ and ‘Pearson VII’ kernel functions, exhibited 92.33% classification accuracy, thereby confirming the apt training of the model.
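For readers who want to reproduce the flavour of this setup, the sketch below trains scikit-learn SVMs with a polynomial kernel and with a custom Pearson VII (PUK) kernel passed as a callable. The PUK parameterization follows the commonly cited Üstün et al. form, and the feature matrix and labels are random stand-ins, not the milling data used in the paper:

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.metrics.pairwise import euclidean_distances

def puk_kernel(X, Y, sigma=1.0, omega=1.0):
    """Pearson VII universal kernel (PUK) in the commonly used parameterization."""
    d = euclidean_distances(X, Y)
    return 1.0 / (1.0 + (2.0 * d * np.sqrt(2.0 ** (1.0 / omega) - 1.0) / sigma) ** 2) ** omega

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 16))            # 16 statistical features per cutting pass (toy data)
y = rng.integers(0, 2, size=200)          # damage-free vs damaged tool (toy labels)

svm_poly = SVC(kernel="poly", degree=3).fit(X, y)
svm_puk = SVC(kernel=lambda A, B: puk_kernel(A, B, sigma=1.0, omega=1.0)).fit(X, y)
print(svm_poly.score(X, y), svm_puk.score(X, y))
```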
Article
Full-text available
Hyperdimensional computing (HD), also known as vector symbolic architectures (VSA), is an emerging and promising paradigm for cognitive computing. At its core, HD/VSA is characterized by its distinctive approach to compositionally representing information using high-dimensional randomized vectors. The recent surge in research within this field gains momentum from its computational efficiency stemming from low-resolution representations and ability to excel in few-shot learning scenarios. Nonetheless, the current literature is missing a comprehensive comparative analysis of various methods since each of them uses a different benchmark to evaluate its performance. This gap obstructs the monitoring of the field’s state-of-the-art advancements and acts as a significant barrier to its overall progress. To address this gap, this review not only offers a conceptual overview of the latest literature but also introduces a comprehensive comparative study of HD/VSA classification methods. The exploration starts with an overview of the strategies proposed to encode information as high-dimensional vectors. These vectors serve as integral components in the construction of classification models. Furthermore, we evaluate diverse classification methods as proposed in the existing literature. This evaluation encompasses techniques such as retraining and regenerative training to augment the model’s performance. To conclude our study, we present a comprehensive empirical study. This study serves as an in-depth analysis, systematically comparing various HD/VSA classification methods using two benchmarks, the first being a set of seven popular datasets used in HD/VSA and the second consisting of 121 datasets being the subset from the UCI Machine Learning repository. To facilitate future research on classification with HD/VSA, we open-sourced the benchmarking and the implementations of the methods we review. Since the considered data are tabular, encodings based on key-value pairs emerge as optimal choices, boasting superior accuracy while maintaining high efficiency. Secondly, iterative adaptive methods demonstrate remarkable efficacy, potentially complemented by a regenerative strategy, depending on the specific problem. Furthermore, we show how HD/VSA is able to generalize while training with a limited number of training instances. Lastly, we demonstrate the robustness of HD/VSA methods by subjecting the model memory to a large number of bit-flips. The results illustrate that the model’s performance remains reasonably stable until the occurrence of 40% of bit flips, where the model’s performance is drastically degraded. Overall, this study performed a thorough performance evaluation on different methods and, on the one hand, a positive trend was observed in terms of improving classification performance but, on the other hand, these developments could often be surpassed by off-the-shelf methods. This calls for better integration with the broader machine learning literature; the developed benchmarking framework provides practical means for doing so.
Article
Full-text available
This study investigates the utilization of three regression models, i.e., Kernel Ridge Regression (KRR), nu-Support Vector Regression (ν-SVR), and Polynomial Regression (PR), for the purpose of forecasting the concentration (C) of a drug within a specified environment, relying on the coordinates (x and y). The analyses were carried out for the separation of drug from a solution by an adsorption process, where the concentration of drug was obtained in the solution and the adsorbent via computational fluid dynamics (CFD), and the results of the concentration distribution were used for machine learning modeling. The model considered mass transfer and fluid flow equations to determine the concentration distribution of the solute in the system. The hyperparameter optimization was carried out using the Fruit-Fly Optimization Algorithm (FFOA), a nature-inspired optimization technique. Our results demonstrate the performance of each model in terms of key regression metrics. KRR achieved an R² score of 0.84851, with a Root Mean Square Error (RMSE) of 1.0384E-01 and a Mean Absolute Error (MAE) of 7.27762E-02. ν-SVR exhibited exceptional accuracy with an R² of 0.98593, accompanied by an RMSE of 3.5616E-02 and an MAE of 1.36749E-02. PR, a traditional regression method, attained an R² score of 0.94077, an RMSE of 7.2042E-02, and an MAE of 4.81533E-02.
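A rough template for this kind of model comparison is sketched below using scikit-learn's KernelRidge, NuSVR and a polynomial-features pipeline on a synthetic concentration field; the hyperparameters are arbitrary placeholders (the paper tunes them with FFOA), so the reported metrics will not match the study's values:

```python
import numpy as np
from sklearn.kernel_ridge import KernelRidge
from sklearn.svm import NuSVR
from sklearn.preprocessing import PolynomialFeatures
from sklearn.pipeline import make_pipeline
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score, mean_squared_error, mean_absolute_error

rng = np.random.default_rng(0)
XY = rng.uniform(0, 1, size=(1000, 2))                 # coordinates (x, y)
C = np.exp(-5 * XY[:, 0]) * np.sin(3 * XY[:, 1])       # synthetic concentration field

X_tr, X_te, c_tr, c_te = train_test_split(XY, C, test_size=0.25, random_state=0)

models = {
    "KRR":   KernelRidge(kernel="rbf", alpha=1e-3, gamma=10.0),
    "nuSVR": NuSVR(kernel="rbf", nu=0.5, C=10.0, gamma=10.0),
    "PR":    make_pipeline(PolynomialFeatures(degree=4), LinearRegression()),
}
for name, model in models.items():
    pred = model.fit(X_tr, c_tr).predict(X_te)
    print(name,
          "R2=%.4f" % r2_score(c_te, pred),
          "RMSE=%.3e" % np.sqrt(mean_squared_error(c_te, pred)),
          "MAE=%.3e" % mean_absolute_error(c_te, pred))
```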
Chapter
This chapter gives a thorough examination of kernel functions k(x, x′), which are the primary driver of a GP model. We review the most common examples of stationary and non-stationary kernel families, discuss kernel composition and survey kernel selection approaches. The latter half of the chapter considers convergence and universal approximation properties of GP surrogates as the training set grows and then reviews links between GPs and stochastic differential equations. The chapter is accompanied by a Python Jupyter notebook illustrating the fitting of different GP kernels and prior mean functions to a synthetic one-dimensional dataset.
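In the spirit of the accompanying notebook, kernel composition for a GP can be sketched with scikit-learn's GaussianProcessRegressor on a synthetic 1D dataset; the specific kernel sum (RBF trend + periodic component + white noise) and all initial hyperparameters are illustrative assumptions:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ExpSineSquared, WhiteKernel, ConstantKernel

rng = np.random.default_rng(0)
x = np.linspace(0, 10, 80)[:, None]
y = np.sin(2 * x).ravel() + 0.05 * x.ravel() ** 2 + 0.1 * rng.normal(size=80)

# Kernel composition: a smooth RBF trend plus a periodic component plus noise.
kernel = (ConstantKernel(1.0) * RBF(length_scale=2.0)
          + ConstantKernel(1.0) * ExpSineSquared(length_scale=1.0, periodicity=3.0)
          + WhiteKernel(noise_level=0.01))

gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True).fit(x, y)
mean, std = gp.predict(np.linspace(0, 12, 200)[:, None], return_std=True)
print(gp.kernel_)   # hyperparameters after marginal-likelihood optimization
```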
Article
Full-text available
This work discusses weighted kernel point projection (WKPP), a new method for embedding metric space or kernel data. WKPP is based on an iteratively weighted generalization of multidimensional scaling and kernel principal component analysis, and one of its main uses is outlier detection. After a detailed derivation of the method and its algorithm, we give theoretical guarantees regarding its convergence and outlier detection capabilities. Additionally, as one of our mathematical contributions, we give a novel characterization of kernelizability, connecting it also to the classical kernel literature. In our empirical examples, WKPP is benchmarked with respect to several competing outlier detection methods, using various different datasets. The obtained results show that WKPP is computationally fast, while simultaneously achieving performance comparable to state-of-the-art methods.
Article
Full-text available
Recently, the importance of analyzing data and collecting valuable insight efficiently has been increasing in various fields. Estimating mutual information (MI) plays a critical role in investigating the relationship among multiple random variables with a nonlinear correlation. In particular, the task of determining whether they are independent or not is called the independence test, whose core subroutine is estimating MI from given data. It is a fundamental tool in statistics and data analysis that can be applied in a wide range of applications such as hypothesis testing, causal discovery, and more. In this paper, we propose a method for estimating mutual information using the quantum kernel. We investigate the performance under various problem settings, such as different sample sizes or shapes of the probability distribution. As a result, the quantum kernel method showed higher performance than the classical one in situations where the number of samples is small, the variance is large, or the variables possess highly non-linear relationships. We discuss this behavior in terms of the central limit theorem and the structure of the corresponding quantum reproducing kernel Hilbert space.
Article
Full-text available
Computations in high-dimensional spaces can often be realized only approximately, using a certain number of projections onto lower-dimensional subspaces or sampling from distributions. In this paper, we are interested in pairs of real-valued functions (F, f) on [0, ∞) that are related by the projection/slicing formula F(‖x‖) = E_ξ[ f(|⟨x, ξ⟩|) ] for x ∈ ℝ^d, where the expectation value is taken over uniformly distributed directions in ℝ^d. While it is known that F can be obtained from f by an Abel-like integral formula, we construct conversely f from given F using their Fourier transforms. First, we consider the relation between F and f for radial functions F(‖·‖) that are Fourier transforms of L¹ functions. Besides d- and one-dimensional Fourier transforms, it relies on a rotation operator, an averaging operator and a multiplication operator to manage the walk from d to one dimension in Fourier space. Then, we generalize the results to tempered distributions, where we are mainly interested in radial regular tempered distributions. Based on Bochner’s theorem, this includes positive definite functions F(‖·‖) and, by the theory of fractional derivatives, also functions F whose derivative of order ⌊d/2⌋ is slowly increasing and continuous.
Article
Full-text available
Conservation laws are of great theoretical and practical interest. We describe an alternative approach to machine learning conservation laws of finite-dimensional dynamical systems using trajectory data. It is a unique approach based on kernel methods instead of neural networks, which leads to lower computational costs and requires less training data. We propose the use of an “indeterminate” form of kernel ridge regression where the labels still have to be found by additional conditions. We use a simple approach, minimizing the length of the coefficient vector, to discover a single conservation law. Published by the American Physical Society, 2025.
Article
Manifold learning techniques have emerged as crucial tools for uncovering latent patterns in high-dimensional single-cell data. However, most existing dimensionality reduction methods primarily rely on 2D visualization, which can distort true data relationships and fail to extract reliable biological information. Here, we present DTNE (diffusive topology neighbor embedding), a dimensionality reduction framework that faithfully approximates manifold distance to enhance cellular relationships and dynamics. DTNE constructs a manifold distance matrix using a modified personalized PageRank algorithm, thereby preserving topological structure while enabling diverse single-cell analyses. This approach facilitates distribution-based cellular relationship analysis, pseudotime inference, and clustering within a unified framework. Extensive benchmarking against mainstream algorithms on diverse datasets demonstrates DTNE’s superior performance in maintaining geodesic distances and revealing significant biological patterns. Our results establish DTNE as a powerful tool for high-dimensional data analysis in uncovering meaningful biological insights.
Article
In two-stage electricity markets, renewable power producers enter the day-ahead market with a forecast of future power generation and then reconcile any forecast deviation in the real-time market at a penalty. The choice of the forecast model is thus an important strategy decision for renewable power producers as it affects financial performance. In electricity markets with large shares of renewable generation, the choice of the forecast model impacts not only individual performance but also outcomes for other producers. In this paper, we argue for the existence of a competitive regression equilibrium in two-stage electricity markets in terms of the parameters of private forecast models informing the participation strategies of renewable power producers. In our model, renewables optimize the forecast against the day-ahead and real-time prices, thereby maximizing the average profits across the day-ahead and real-time markets. By doing so, they also implicitly enhance the temporal cost coordination of day-ahead and real-time markets. We base the equilibrium analysis on the theory of variational inequalities, providing results on the existence and uniqueness of regression equilibrium in energy-only markets. We also devise two methods to compute regression equilibrium: centralized optimization and a decentralized ADMM-based algorithm.
Article
Control charts are statistical process control tools that aim to analyse and monitor a certain quality characteristic. In some situations, the quality characteristic to be monitored is correlated with one or more control variables, which suggests the use of a regression control chart (RCC). However, outliers are commonly seen in practice, and they can affect the coefficient estimates and control limits of traditional RCCs. For this reason, robust RCCs are strongly recommended, such that outlier observations do not affect model parameter estimates and control limits. This paper considers new robust regression Shewhart‐type control charts based on the exponential‐type kernel and mean absolute deviation estimators. To evaluate the performance of the proposed control charts, we conduct an extensive Monte Carlo simulation study where we discuss the ability of the control charts to detect changes in the process, both in in‐control and out‐of‐control settings, taking into account different percentages of outliers and sample sizes. The numerical results indicate that the proposed robust RCCs performed well compared to their competitors. Finally, to demonstrate the applicability of the control charts proposed in this paper, we present and discuss an empirical application to temperature monitoring data in Sydney, Australia.
Article
Magnetic nanoparticles (NPs) are gaining significant interest in the field of biomedical functional nanomaterials because of their distinctive chemical and physical characteristics, particularly in drug delivery and magnetic hyperthermia applications. In this paper, we experimentally synthesized and characterized new Fe3O4-based NPs, functionalizing their surface with a 5-TAMRA cadaverine modified copolymer consisting of PMAO and PEG. Despite these advancements, many combinations of NP cores and coatings remain unexplored. To address this, we created a new data set of NP systems from public sources. Herein, 11 different AI/ML algorithms were used to develop the predictive AI/ML models. The linear discriminant analysis (LDA) and random forest (RF) models showed high values of sensitivity and specificity (>0.9) in training/validation series and 3-fold cross validation, respectively. The AI/ML models are able to predict 14 output properties (CC50 (μM), EC50 (μM), inhibition (%), etc.) for all combinations of 54 different NP core classes vs. 25 different coats and vs. 41 different cell lines, allowing the shortlisting of the best results for experimental assays. The results of this work may help to reduce the cost of traditional trial-and-error procedures.
Article
Model Order Reduction (MOR) techniques play a crucial role in reducing the computational complexity of high-dimensional mathematical models, enabling efficient simulations and analysis. In recent years, Artificial Intelligence (AI) has emerged as a powerful tool in various domains, including MOR. This survey paper provides an overview of AI-based MOR techniques, exploring how AI methods are being integrated into traditional MOR approaches. Different AI algorithms, such as machine learning, deep learning, and evolutionary computing, and their applications in MOR are discussed in this paper. The advantages, challenges, and future directions of AI-based MOR techniques are also highlighted.
Article
Full-text available
In 2015, all United Nations Member States adopted 17 Sustainable Development Goals (SDGs) for the 2030 agenda. Addressing the issue of employing alternative data sources for exploring aspects of utilizing said goals, this paper explores the Circular Economy dimension within the SDG12 score, focusing on responsible production and consumption and the broader SDG index. Data from LinkedIn are collected, examining profiles, companies, job postings, and services using the keywords ‘Sustainable Development Goals’ and ‘Circular Economy’. Furthermore, the SDG index (including the SDG12 score) for the United States is integrated in the analysis; SDG is a published metric evaluating the progress of sustainable communities within each state. Finally, data on the past five US general elections are retrieved, in order to explore the relationship between SDGs, Circular Economy, and voting behavior. Regression analyses incorporating PCA components and state election data reveal that the LinkedIn-derived SDG and circular economy components exhibit positive impacts on the corresponding indices. Notably, a state’s political inclination toward the Republican or the Democratic parties highlights contrasting effects on the SDG and SDG12 indices, indicating divergent trends based on electoral choices. Overall, this study underscores LinkedIn’s potential as a valuable source for assessing SDG and Circular Economy position in the US, and highlights the interplay between political factors and sustainable communities at state level.
Article
Full-text available
The nonparametric multivariate analysis of variance (NPMANOVA) testing procedure has been proven to be a valuable tool for comparing groups. In the present paper, we propose a kernel extension of this technique in order to effectively confront high-dimensionality, a recurrent problem in many fields of science. The new method is called kernel multivariate analysis of variance (KMANOVA). The basic idea is to take advantage of the kernel framework: we propose to project the data from the original data space to a Hilbert space generated by a given kernel function and then perform the NPMANOVA method in the reproducing kernel Hilbert space (RKHS). Dispersion of the embedded points can be measured by the distance induced by the inner product in the RKHS but also by many other distances best suited in high-dimensional settings. For this purpose, we study two promising distances: a Manhattan-type distance and a distance based on an orthogonal projection of the embedded points in the direction of the group centroids. We show that the NPMANOVA method and the KMANOVA method with the induced distance are essentially equivalent. We also show that the KMANOVA method with the other two distances performs considerably better than the NPMANOVA method. We illustrate the advantages of our approach in the context of genetic association studies and demonstrate its usefulness on Alzheimer’s disease data. We also provide a software implementation of the method that is available on GitHub https://github.com/8699vicente/Kmanova .
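The core recipe, embedding the data with a kernel and running an NPMANOVA-style permutation test on the induced distances d²(i, j) = κ(i, i) + κ(j, j) − 2κ(i, j), can be sketched as follows. This is a generic PERMANOVA-type pseudo-F on RKHS distances with an RBF kernel and toy high-dimensional data, not the authors' KMANOVA implementation or their alternative distances:

```python
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel

def kernel_induced_sq_dists(K):
    """d^2(i, j) = K(i, i) + K(j, j) - 2 K(i, j): squared distances in the RKHS."""
    diag = np.diag(K)
    return diag[:, None] + diag[None, :] - 2 * K

def pseudo_F(D2, labels):
    """PERMANOVA-style pseudo-F statistic from squared distances and group labels."""
    n = len(labels)
    groups = np.unique(labels)
    ss_total = D2[np.triu_indices(n, 1)].sum() / n
    ss_within = 0.0
    for g in groups:
        idx = np.where(labels == g)[0]
        ss_within += D2[np.ix_(idx, idx)][np.triu_indices(len(idx), 1)].sum() / len(idx)
    ss_between = ss_total - ss_within
    return (ss_between / (len(groups) - 1)) / (ss_within / (n - len(groups)))

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1, (30, 500)),      # high-dimensional toy data,
               rng.normal(0.3, 1, (30, 500))])     # two groups with a small shift
labels = np.array([0] * 30 + [1] * 30)

D2 = kernel_induced_sq_dists(rbf_kernel(X, gamma=1.0 / 500))
F_obs = pseudo_F(D2, labels)
perm_F = [pseudo_F(D2, rng.permutation(labels)) for _ in range(999)]
p_value = (1 + sum(f >= F_obs for f in perm_F)) / 1000
print(F_obs, p_value)
```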
Article
Full-text available
Interacting particle systems (IPSs) are a very important class of dynamical systems, arising in different domains like biology, physics, sociology and engineering. In many applications, these systems can be very large, making their simulation and control, as well as related numerical tasks, very challenging. Kernel methods, a powerful tool in machine learning, offer promising approaches for analyzing and managing IPS. This paper provides a comprehensive study of applying kernel methods to IPS, including the development of numerical schemes and the exploration of mean-field limits. We present novel applications and numerical experiments demonstrating the effectiveness of kernel methods for surrogate modelling and state-dependent feature learning in IPS. Our findings highlight the potential of these methods for advancing the study and control of large-scale IPS.
Chapter
Statistics and probability theory complement each other, and so do the statistical framework and the probabilistic framework. This chapter begins with an overview of the statistical framework, including statistics and statistical learning. Secondly, we review the two components of classic statistics, namely descriptive statistics and inferential statistics. We next introduce two statistical inference methods in mathematical statistics, namely frequentist inference and Bayesian inference. This chapter then discusses statistical learning theory, where statistical models are the cornerstone of statistical learning theory, statistical learning models are its core, and growth functions, VC dimension, and Rademacher complexity are its components. Parametric and nonparametric models, as well as kernel methods, are also important components of statistical frameworks, and they are detailed in the last two sections of this chapter.
Article
Full-text available
Support vector regression (SVR) is a powerful kernel-based regression prediction algorithm that performs excellently in various application scenarios. However, for real-world data, the general SVR often fails to achieve good predictive performance due to its inability to assess feature contribution accurately. Feature weighting is a suitable solution to address this issue, applying correlation measurement methods to obtain reasonable weights for features based on their contributions to the output. In this paper, based on the idea of a Hilbert–Schmidt independence criterion least absolute shrinkage and selection operator (HSIC LASSO) for selecting features with minimal redundancy and maximum relevance, we propose a novel feature-weighted SVR that considers the importance of features to the output and the redundancy between features. In this approach, the HSIC is utilized to effectively measure the correlation between features as well as that between features and the output. The feature weights are obtained by solving a LASSO regression problem. Compared to other feature weighting methods, our method takes much more comprehensive consideration of weight calculation, and the obtained weighted kernel function can lead to more precise predictions for unknown data. Comprehensive experiments on real datasets from the University of California Irvine (UCI) machine learning repository demonstrate the effectiveness of the proposed method.
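A feature-weighted kernel of the kind described here can be plugged into scikit-learn's SVR as a callable; in the sketch below the weights are hand-picked stand-ins for the HSIC-LASSO solution, so it only illustrates the mechanics of weighting features inside an RBF kernel:

```python
import numpy as np
from sklearn.svm import SVR
from sklearn.metrics.pairwise import rbf_kernel

def weighted_rbf(X, Y, w, gamma=1.0):
    """RBF kernel on feature-weighted inputs: k(x, y) = exp(-gamma * ||w*(x - y)||^2)."""
    return rbf_kernel(X * w, Y * w, gamma=gamma)

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 8))
y = 3 * X[:, 0] - 2 * X[:, 1] + 0.1 * rng.normal(size=300)   # only two relevant features

# In the paper the weights come from an HSIC-LASSO problem; here we simply
# plug in illustrative weights that emphasize the informative features.
w = np.array([1.0, 0.8, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05])

svr = SVR(kernel=lambda A, B: weighted_rbf(A, B, w, gamma=0.5)).fit(X, y)
print("train R^2:", svr.score(X, y))
```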
Article
Full-text available
Named Entity Recognition (NER) is considered an important subtask in information extraction that aims to identify Named Entities (NEs) within a given text and classify them into predefined categories (e.g., person, location, organization, and miscellaneous). The use of an appropriate annotation scheme is crucial to label multi-word NEs and enhance recognition performance. This study investigates the effects of using different annotation schemes on NER systems for the Arabic language. The impact of seven annotation schemes, namely IO, IOB, IOE, IOBE, IOBS, IOES, and IOBES, on Arabic NER is examined by applying conditional random fields, multinomial Naive Bayes, and support vector machine classifiers. The experimental results reveal the importance of selecting an optimal annotation scheme and show that annotating NEs based on the simple IO scheme yields a higher performance in terms of precision, recall, and F-measure compared to the other schemes.
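To make the schemes concrete, the toy example below tags one (transliterated) sentence under three of the seven schemes; the tag set and the IOB variant shown (the common IOB2 convention, where every entity starts with B) are illustrative assumptions:

```python
# The same sentence labeled under three of the compared annotation schemes
# ("PER" = person, "LOC" = location; tags are illustrative only).
tokens = ["Ahmed", "Ali", "visited", "New", "York"]

io    = ["I-PER", "I-PER", "O", "I-LOC", "I-LOC"]   # inside / outside only
iob   = ["B-PER", "I-PER", "O", "B-LOC", "I-LOC"]   # explicit entity beginnings (IOB2 style)
iobes = ["B-PER", "E-PER", "O", "B-LOC", "E-LOC"]   # adds End (and S for single-token entities)

for tok, a, b, c in zip(tokens, io, iob, iobes):
    print(f"{tok:10s} {a:8s} {b:8s} {c:8s}")
```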
Article
This paper investigates the potential of a fully behavioral approach for the generation of accurate models of digital IC buffers based on conventional kernel regressions. The proposed approach does not assume a specific model structure like the classical two-piece model representation which has been massively used in literature, offering a promising and viable alternative to facilitate the modeling of nonlinear electrical devices. The collected results represent a first proof-of-concept, aimed at demonstrating the strengths of the proposed alternative modeling approach.