About
177
Publications
14,440
Reads
1,729
Citations
Publications (177)
The number of connected embedded edge computing Internet of Things (IoT) devices has been increasing over the years, contributing to the significant growth of available data in different scenarios. Thereby, machine learning algorithms arise to enable task automation and process optimization based on those data. However, due to some learning methods...
Detecting anomalies in industrial processes is a critical task. Prior fault detection can reduce company costs, and most importantly, may prevent accidents and environmental damage. Anomaly detection can be treated as drift detection, as both aim at identifying changes in data that happen unexpectedly over time. In this paper, a graph-based approa...
Performance metrics are usually evaluated only after the neural network learning process using an error cost function. This procedure can result in suboptimal model selection, particularly for imbalanced classification problems. This work proposes the direct use of these metrics as cost functions, which are often derived from the confusion matrix....
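As a minimal illustration of why confusion-matrix metrics matter for imbalanced problems, the sketch below computes accuracy and G-Mean for a classifier that always predicts the majority class. It only evaluates the metrics; it does not reproduce the paper's use of them as cost functions, and the function names are hypothetical.

```python
import math

def confusion_counts(y_true, y_pred):
    """Return (tp, fn, fp, tn) for binary labels, with 1 as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fn, fp, tn

def accuracy(tp, fn, fp, tn):
    """Fraction of correctly classified samples."""
    return (tp + tn) / (tp + fn + fp + tn)

def g_mean(tp, fn, fp, tn):
    """Geometric mean of sensitivity and specificity."""
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return math.sqrt(sensitivity * specificity)

# Two positives, eight negatives, and a model that always predicts 0.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [0] * 10
counts = confusion_counts(y_true, y_pred)
print(accuracy(*counts), g_mean(*counts))  # 0.8 0.0
```

Accuracy looks acceptable here while G-Mean exposes that the minority class is never recognized.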
It is well known that there is an increasing interest in edge computing to reduce the distance between cloud and end devices, especially for Machine Learning (ML) methods. However, when related to latency-sensitive applications, little work can be found in ML literature on suitable embedded systems implementations. This paper presents new ways to i...
This work presents a two-stage methodology for data classification using an active learning technique. The first stage consists of determining a starting point for choosing the initial data to be labeled. In the second stage, the active learning algorithm is applied to the data. Since the label...
A bottleneck of laboratory analysis in process industries, including steelmaking plants, is the low sampling rate. Inference models using only variables measured online have therefore been used to make such information available in advance. This study develops predictive models for key mechanical properties of seamless steel tubes, by strength, ultimate t...
This work proposes a new algorithm for training neural networks to solve the problems of feature selection and function approximation. The algorithm applies different weight constraint functions for the hidden and the output layers of a multilayer perceptron neural network. The LASSO operator is applied to the hidden layer; therefore, the training...
Early detection of technical errors in medical examinations, especially in remote locations, is of utmost importance in order to avoid invalid measurements that would require costly and time-consuming repetitions. This paper proposes a highly efficient method for the identification of an erroneous inversion of the measuring electrodes during a mult...
This paper presents a novel approach to deal with the imbalanced data set problem in neural networks by incorporating prior probabilities into a cost-sensitive cross-entropy error function. Several classical benchmarks were tested for performance evaluation using different metrics, namely G-Mean, area under the ROC curve (AUC), adjusted G-Mean, Acc...
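One common way to bring prior probabilities into a cross-entropy error is to scale each sample's term by the inverse of its class prior. The sketch below assumes that weighting scheme; the abstract does not give the paper's exact formulation, so the function names and the weighting are illustrative only.

```python
import math

def class_priors(labels):
    """Estimate prior probabilities of classes 0 and 1 from binary labels."""
    n = len(labels)
    positives = sum(labels)
    return {1: positives / n, 0: (n - positives) / n}

def prior_weighted_cross_entropy(labels, probs, priors, eps=1e-12):
    """Mean cross-entropy with each term divided by its class prior (assumed scheme)."""
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1.0 - eps)  # guard against log(0)
        if y == 1:
            total += -math.log(p) / priors[1]
        else:
            total += -math.log(1.0 - p) / priors[0]
    return total / len(labels)

# One minority sample and three majority samples, all predicted with p = 0.5.
labels = [1, 0, 0, 0]
probs = [0.5, 0.5, 0.5, 0.5]
priors = class_priors(labels)
loss = prior_weighted_cross_entropy(labels, probs, priors)
```

With this weighting, the single minority sample contributes as much to the loss as the three majority samples together, which is the balancing effect a cost-sensitive error function aims for.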
Identification of piecewise affine hybrid systems is not an easy task since both the parameters determining the discrete modes and the submodels parameters have to be obtained, leading to a non‐convex combinatorial optimisation problem. One way around this problem is to solve it in two steps. This work presents an approach for simultaneously estima...
This paper presents a new relevance index based on mutual information that uses both labeled and unlabeled data. The proposed index takes into account the similarity between features and their joint influence on the output variable. Based on this principle, a method to select features is developed to elimina...
Failure prediction based on anomaly detection and forecasting is becoming a reality thanks to the introduction of machine learning techniques. The orchestration layer can leverage this new feature to proactively reconfigure cloud services that might find themselves traversing an element that is about to fail. As a result, the number of cloud ser...
This work proposes an online map-matching method for preprocessing vehicle trajectories. The method uses a logistic model to determine the probability that a trajectory location belongs to a segment representing a street or road. The model was evaluated in two stages. The first evaluates the c...
This work proposes a supervised incremental classifier, based on the geometric formulation of the SVM, that uses the Simplex method to solve linear programming problems modeled for each class. Optimal points are sought in each data set; these points are used to compute the hyperpl...
This work presents a new clustering algorithm, the GPIC, a Graphics Processing Unit (GPU) accelerated algorithm for Power Iteration Clustering (PIC). Our algorithm is based on the original PIC proposal, adapted to take advantage of the GPU architecture, maintaining the algorithm’s original properties. The proposed method was compared against the se...
Multiple Instance Learning (MIL) is a recent learning paradigm based on the assignment of a single label to a set of instances called a bag. A bag is positive if it contains at least one positive instance, and negative otherwise. This work proposes a new algorithm based on likelihood computation by means of Kernel Density Estimation (KDE...
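The bag-labeling rule stated above can be written down directly. This is a minimal sketch of the rule only, with hypothetical function names; it does not implement the KDE-based likelihood algorithm the paper proposes. The max-score variant is a common convention, assumed here for illustration.

```python
def bag_label(instance_labels):
    """A bag is positive (1) if at least one instance is positive, else negative (0)."""
    return 1 if any(label == 1 for label in instance_labels) else 0

def bag_prediction(instance_scores, threshold=0.5):
    """Predict a bag label from per-instance scores using the maximum score (assumed convention)."""
    return 1 if max(instance_scores) >= threshold else 0

bags = {"a": [0, 0, 1], "b": [0, 0, 0]}
labels = {name: bag_label(instances) for name, instances in bags.items()}
print(labels)  # {'a': 1, 'b': 0}
```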
This work presents a novel approach for decision-making for multi-objective binary classification problems. The purpose of the decision process is to select within a set of Pareto-optimal solutions, one model that minimizes the structural risk (generalization error). This new approach utilizes a kind of prior knowledge that, if available, allows th...
This work presents on-chip learning of artificial neural networks in an FPGA multiprocessor system, where each neuron is implemented in a soft-core processor. In order to take maximum advantage of the distributed architecture, a pipelined version of the on-line back-propagation algorithm is used, providing a high degree of parallelism between neu...
This work presents a system for recognizing numeric characters on automobile license plates. The system comprises three main modules: the first module is responsible for locating the plate by applying a set of morphological operators, such as the White Top-Ha...
Instantaneous measurements of process variables are usually not representative of the process effects as a whole when defining the condition of an output sample mainly in case of laboratory analysis. Moreover, process data have considerable dispersion. This leads to uncertainty in input–output time alignment and in variable relationship. This work...
Currently Mutual Information has been widely used in pattern recognition and feature selection problems. It may be used as a measure of redundancy between features as well as a measure of dependency evaluating the relevance of each feature. Since marginal densities of real datasets are not usually known in advance, mutual information should be eval...
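When the densities are unknown, the simplest estimate for discrete variables replaces each probability with an observed frequency. The sketch below shows that plug-in estimate as an illustration; it is an assumption about a typical baseline, not the estimator evaluated in the paper.

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """Plug-in estimate of I(X; Y) in bits from paired discrete samples."""
    n = len(xs)
    count_x = Counter(xs)
    count_y = Counter(ys)
    count_xy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in count_xy.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * math.log2(c * n / (count_x[x] * count_y[y]))
    return mi

feature = [0, 0, 1, 1]
target_dependent = [0, 0, 1, 1]    # fully determined by the feature
target_independent = [0, 1, 0, 1]  # carries no information about the feature
print(mutual_information(feature, target_dependent))    # 1.0
print(mutual_information(feature, target_independent))  # 0.0
```

Applied between a feature and the output, the estimate measures relevance; applied between two features, it measures redundancy.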
Model combination is a strategy that has been widely used to improve modeling performance. The premise is that combining independent and different decisions from individual models causes random errors to cancel out and correct decisions to be reinforced. An ensemble is therefore based on the idea of diversity: ...
This work presents a new clustering algorithm, the GPIC, a Graphics Processing Unit (GPU) accelerated algorithm for Power Iteration Clustering (PIC). Our algorithm is based on the original PIC proposal, adapted to take advantage of the GPU architecture, maintaining the algorithm's original properties. The proposed method was compared against the serial...
Provides a listing of current staff, committee members and society officers.
A new learning method for classification problems that is suitable for integrated circuit implementation is presented. The method, which outperforms current approaches in many data sets, is based on a structural description of the learning set represented by a planar graph. The final classification function is composed of a hierarchical mixture of...
This paper presents an alternative method based on Petri nets for calculating the Area Under the ROC Curve (AUC). The proposed method enables a more accurate visualization and an adjustable numerical precision. Experiments with Multi Layer Perceptron Neural Networks applied to a UCI spam detection database showed that our strategy is promising. W...
Big Data problems demand data models with abilities to handle time-varying, massive, and high dimensional data. In this context, Active Learning emerges as an attractive technique for the development of high performance models using few data. The importance of Active Learning for Big Data becomes more evident when labeling cost is high and data is...
Background: Filter feature selection methods compute molecular signatures by selecting subsets of genes in the ranking of a valuation function. The motivations of the valuation functions choice are almost always clearly stated, but those for selecting the genes according to their ranking are hardly ever explicit.
Method: We addressed the computatio...
This paper proposes a novel regularization approach for Extreme Learning Machines. Regularization is performed using a priori spatial information expressed by an affinity matrix. We show that the use of this type of a priori information is similar to performing Tikhonov regularization. Furthermore, if a parameter free affinity matrix is used, like the...
This work proposes a sequential methodology for selecting variables in classification problems in which the number of predictors is much larger than the sample size. The methodology includes a Monte Carlo permutation procedure that conditionally tests the null hypothesis of no association among the outcomes and the available predictors. In order to...
Downflow Lo-Solids cooking technology was adopted at Arauco Mill in an attempt to reduce the variability in kappa number. In order to increase understanding of the cooking parameters on the final kappa number, a model was developed based on data extracted from the continuous digester at Arauco mill's line 2. Several modelling techniques were tested...
This work evaluates two pattern classification methods based on the maximum margin from computational geometry and compares them with the Support Vector Machine, which is currently considered the state of the art for this task. A way to reduce cyclomatic complexity based on the Gabriel graph algorithm to improve its performance is also pr...
The RBF network is commonly used for classification and function approximation. The center and radius of each neuron's activation function are important parameters to be found before network training. This paper presents a method based on computational geometry to find these coefficients without any parameters provided by the user. The method...
Introduction: Function induction problems are frequently represented by affinity measures between the elements of the inductive sample set, and kernel matrices are a well-known example of affinity measures. Methods: The objective of the present work is to obtain information about the relations between data from a calculated kernel matrix by initial...
INTRODUCTION: Function induction problems are frequently represented by affinity measures between the elements of the inductive sample set, and kernel matrices are a well-known example of affinity measures. METHODS: The objective of the present work is to obtain information about the relations between data from a calculated kernel matrix by initial...
Acute leukemia classification into its Myeloid and Lymphoblastic subtypes is usually accomplished according to the morphological appearance of the tumor. Nevertheless, cells from the two subtypes can have similar histopathological appearance, which makes screening procedures very difficult. Correct classification of patients in the initial phases o...
This paper presents a new validity index for fuzzy partitions generated by the fuzzy c-means algorithm. The proposed validity index is based on the calculation of factors from the proximity matrix generated from the membership matrix generated by a fuzzy clustering partition algorithm, such as FCM. The experimental results show that the proposed ap...
The aim of this study was to present a new training algorithm using artificial neural networks called multi-objective least absolute shrinkage and selection operator (MOBJ-LASSO) applied to the classification of dynamic gait patterns. The movement pattern is identified by 20 characteristics from the three components of the ground reaction force whi...
Traditional learning algorithms applied to complex and highly imbalanced training sets may not give satisfactory results when distinguishing between examples of the classes. The tendency is to yield classification models that are biased towards the overrepresented (majority) class. This paper investigates this class imbalance problem in the context...
This paper presents an evolutionary wrapper method for feature selection that uses a non-parametric density estimation method and a Bayesian Classifier. Non-parametric methods are a good alternative for scarce and sparse data, as in Bioinformatics problems, since they do not make any assumptions about the data structure and all the information comes from...
This paper describes the use of a computer vision technique, template matching, and evaluates its efficiency in assisting the navigation of a UAV. The methodology used to carry out the experiments is detailed, and then the results are shown along with factors of improvements in the technique of template mat...
This paper investigates the use of the Area Under the ROC Curve (AUC) as an alternative criterion for model selection in classification problems with unbalanced datasets. A novel algorithm, named here AUCMLP, which incorporates AUC optimization into the Multi-layer Perceptron (MLP) learning process, is presented. The basic principle of AUCMLP is...
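For reference, the AUC of a scoring classifier equals the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one. The sketch below computes that pairwise statistic directly; it illustrates the quantity used as the selection criterion, not the AUCMLP training algorithm, and the function name is hypothetical.

```python
def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count as half."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one example of each class")
    wins = 0.0
    for sp in positives:
        for sn in negatives:
            if sp > sn:
                wins += 1.0
            elif sp == sn:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.5, 0.1]
print(auc(labels, scores))  # 0.75
```

Because the statistic depends only on the ranking of positives against negatives, it is insensitive to the class proportions, which is what makes it attractive for unbalanced datasets.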
This paper presents a Pareto-optimal selection strategy for multiobjective learning that is based on the geometry of the separation margin between classes. The Gabriel Graph, a method borrowed from Computational Geometry, is constructed in order to obtain margin patterns and class borders. From border edges, a target separator is obtained in order...
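The Gabriel Graph connects two points whenever no third point lies inside the ball whose diameter is the segment between them. The sketch below builds the graph by brute force and keeps the edges that join points of different classes. Treating those edges as the border edges is an assumption made for illustration, and the function names are hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def gabriel_edges(points):
    """Return index pairs (i, j) that are Gabriel neighbors (brute force, O(n^3))."""
    n = len(points)
    edges = []
    for i in range(n):
        for j in range(i + 1, n):
            d_ij = sq_dist(points[i], points[j])
            # A point k lies inside or on the ball with diameter (i, j) iff
            # d(i, k)^2 + d(j, k)^2 <= d(i, j)^2; such a point blocks the edge.
            if all(
                sq_dist(points[i], points[k]) + sq_dist(points[j], points[k]) > d_ij
                for k in range(n)
                if k != i and k != j
            ):
                edges.append((i, j))
    return edges

def border_edges(edges, labels):
    """Keep only edges whose endpoints belong to different classes."""
    return [(i, j) for i, j in edges if labels[i] != labels[j]]

points = [(0, 0), (1, 0), (2, 0), (3, 0)]
labels = [0, 0, 1, 1]
edges = gabriel_edges(points)
print(edges)                        # [(0, 1), (1, 2), (2, 3)]
print(border_edges(edges, labels))  # [(1, 2)]
```

In this toy example the single border edge joins the two points nearest the class boundary, which is the kind of information the margin-based selection relies on.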
Semi-supervised clustering aims at accomplishing the clustering task by considering also labels or constraints provided by an external agent. Usually, the agent would provide the output label for a reduced number of patterns or, in the case of lack of posterior information about labels, some pairwise constraints indicating whether or not two patter...
This chapter gives a general overview of Artificial Neural Networks Learning from the perspectives of Statistical Learning Theory and Multi-objective Optimization. Both approaches treat the general learning problem as a trade-off between the empirical risk obtained from the data set and the model complexity. Learning is seen as a problem of fitting...
The Pareto-optimality concept is used in this paper in order to represent a constrained set of solutions that are able to trade-off the two main objective functions involved in neural networks supervised learning: data-set error and network complexity. The neural network is described as a dynamic system having error and complexity as its state vari...
Traditional learning algorithms induced by complex and highly imbalanced training sets may have difficulty in distinguishing between examples of the groups. The tendency is to create classification models that are biased toward the overrepresented (majority) class, resulting in a low rate of recognition for the minority group. This paper provides a...
Multi-objective learning has been explored in neural networks because it adjusts the model capacity, providing better generalization properties. It usually requires sophisticated algorithms such as ellipsoidal, sliding-mode, and genetic algorithms, among others. This paper proposes an affordable algorithm that decomposes the gradient into two components...
This paper presents two new approaches for constructing an ensemble of neural networks (NN) using coevolution and the artificial immune system (AIS). These approaches are extensions of the CLONal Selection Algorithm for building ENSembles (CLONENS) algorithm. An explicit diversity promotion technique was added to CLONENS and a novel coevolutionary...
Based on the Theory of Neuronal Group Selection (TNGS), we have investigated the emergence of synchronicity in a network composed of spiking neurons via genetic algorithm. The TNGS establishes that a neuronal group is the most basic unit in the cortical area of the brain and, as a rule, it is not formed by a single neuron, but by a cluster of tight...
Breast cancer is the second most frequent cancer, and the most frequent among women. The standard treatment has three main stages: preoperative chemotherapy, followed by surgery, and then postoperative chemotherapy. Because the response to the preoperative chemotherapy is correlated to a good prognosis, and because the clinical and bi...
A large number of unclassified sequences is still found in public databases, which suggests that there is still need for new investigations in the area. In this contribution, we present a methodology based on Artificial Neural Networks for protein functional classification. A new protein coding scheme, called here Extended-Sequence Coding by Slidin...
We propose a high-dimensional feature selection algorithm and its application to onco-pharmacogenomics for breast cancer. In this application, we must select a set of genes whose expression levels allow effective prediction of patients' response to a treatment of...
The use of auxiliary information during the identification of nonlinear systems can be handled in different ways and at different levels. In this brief, static information of a 15 kW hydraulic pumping system is used as a priori knowledge in the parameters estimation of polynomial models which are compared to polynomial and neural models obtai...
This paper presents a semi-supervised feature selection method based on a univariate relevance measure applied to a multiobjective formulation of the problem. During the selection of the optimal solution within the Pareto-optimal set, which attempts to maximize the relevance index of each feature, it is possible to