About
112 Publications
39,060 Reads
742 Citations
Introduction
Evolutionary Computation
Feature Subset Selection
web: http://sabia.tic.udc.es/mgestal
twitter: @mgestal
Current institution
Additional affiliations
November 2005 - present
June 2001 - present
Publications (112)
Texture information could be used in proteomics to improve the quality of the image analysis of proteins separated on a gel. In order to evaluate the best technique to identify relevant textures, we use several different kernel-based machine learning techniques to classify proteins in 2-DE images into spot and noise. We evaluate the classification...
Parallel to the increased consumption of fruit juices in recent years (thanks to their unrivalled nutritional benefits), fraudulent fruit juices can sometimes be found in the food supply chain. Infrared spectrometry (IR) is a fast and convenient technique to perform screening analyses to assess the quantity of pure juice in commercial beverages. Th...
Many analytical problems involve measurement of a large number of experimental variables on a set of samples. Unfortunately, some of them can deteriorate the performance of classification models because not all variables yield the same quality and quantity of information. In this paper four strategies to perform variable selection in mid-infrared s...
The successful high throughput screening of molecule libraries for a specific biological property is one of the main improvements in drug discovery. The virtual molecular filtering and screening relies greatly on quantitative structure-activity relationship (QSAR) analysis, a mathematical model that correlates the activity of a molecule with molecu...
The transport of the molecules inside cells is a very important topic, especially in Drug Metabolism. The experimental testing of the new proteins for the transporter molecular function is expensive and inefficient due to the large amount of new peptides. Therefore, there is a need for cheap and fast theoretical models to predict the transporter pr...
FIDO2 authentication is starting to be applied in numerous web authentication services, aiming to replace passwords and their known vulnerabilities. However, this new authentication method has not been integrated yet with network authentication systems. In this paper, we introduce FIDO2CAP: FIDO2 Captive-portal Authentication Protocol. Our proposal...
Poster presented at the "VIII Jornadas Nacionales de Investigación en Ciberseguridad" (JNIC 2023)
Abstract: Most network authentication systems are based on passwords, which have many security vulnerabilities. This poster provides an overview of our integration of security keys based on the WebAuthn standard to a captive portal for phishing-resist...
Radon (Rn) is a biological threat to cells due to its radioactivity. It is capable of penetrating the human body and damaging cellular DNA, causing mutations and interfering with cellular dynamics. Human exposure to high concentrations of Rn should, therefore, be minimized. The concentration of radon in a room depends on numerous factors, such as r...
During the last few years, some of the most relevant IT companies have started to develop new authentication solutions which are not vulnerable to attacks like phishing. WebAuthn and FIDO authentication standards were designed to replace or complement the de facto and ubiquitous authentication method: username and password. This paper performs an a...
The theoretical prediction of drug-decorated nanoparticles (DDNPs) has become a very important task in medical applications. For the current paper, Perturbation Theory Machine Learning (PTML) models were built to predict the probability of different pairs of drugs and nanoparticles creating DDNP complexes with anti-glioblastoma activity. PTML model...
A fish can be detected by means of artificial vision techniques, without human intervention or handling the fish. This work presents an application for detecting moving fish in water by artificial vision based on the detection of a fish's eye in the image, using the Hough algorithm and a Feed-Forward network. In addition, this method of detection i...
During the last few years, the FIDO Alliance and the W3C have been working on a new standard called WebAuthn that aims to substitute the obsolete password as an authentication method by using physical security keys instead. Due to its recent design, the standard is still changing and so are the needs for protocol testing. This research has driven t...
Drug-decorated nanoparticles (DDNPs) have important medical applications. The current work combined Perturbation Theory with Machine Learning and Information Fusion (PTMLIF). Thus, PTMLIF models were proposed to predict the probability of nanoparticle–compound/drug complexes having antimalarial activity (against Plasmodium). The aim is to save expe...
Brain Connectome Networks (BCNs) are defined by brain cortex regions (nodes) interacting with others by electrophysiological co-activation (edges). The experimental prediction of new interactions in BCNs represents a difficult task due to the large number of edges and the complex connectivity patterns. Fortunately, we can use another special type o...
Radon gas has been declared a human carcinogen by the United States Environmental Protection Agency (USEPA) and the International Agency for Research on Cancer (IARC). Several studies carried out in Spain highlighted the high radon concentrations in several regions, with Galicia (northwestern Spain) being one of the regions with the highest radon c...
In this work, we improved a previous model used for the prediction of proteomes as new B-cell epitopes in vaccine design. The predicted epitope activity of a queried peptide is based on its sequence, a known reference epitope sequence under specific experimental conditions. The peptide sequences were transformed into molecular descriptors of sequen...
Transcriptome analysis, as a tool for the characterization and understanding of phenotypic alterations in molecular biology, plays an integral role in the understanding of complex, multi-factorial and heterogeneous diseases such as cancer. Profiling of transcriptome is used for searching the genes that show differences in their expression level ass...
Learning a programming language requires a great deal of effort in both the theoretical and practical domains. As far as theory is concerned, a knowledge of the methods, concepts, and attributes that are characteristic of the language, as well as an understanding of its specific structures and peculiarities, is required. On the other hand, mastering the...
Data mining and data classification over biomedical data are two of the most important research fields in computer science. Among the great diversity of techniques that can be used for this purpose, Artificial Neural Networks (ANNs) are among the most suitable. One of the main problems in the development of this technique is the slow performance of th...
Nowadays biomedical research is generating huge amounts of omic data, covering all levels of genetic information from nucleotide sequencing to protein metabolism. In the beginning, data were analyzed independently losing a great deal of essential information in the models. Even so, complex metabolic routes and genetic diseases could be determined....
Vertical slot fishways are hydraulic structures which allow the upstream migration of fish through obstructions in rivers. Their design depends on the interplay between hydraulic and biological variables to match the requirements of the fish species for which they are intended. However, current mechanisms to study fish behavior in fishway models ar...
The design of experiments and the validation of the results achieved with them are vital in any research study. This paper focuses on the use of different Machine Learning approaches for regression tasks in the field of Computational Intelligence and especially on a correct comparison between the different results provided for different methods, as...
Original datasets, raw data results, summary file results for each dataset separated in folders
Detailed results from the UC Irvine Machine Learning Repository (Housing, Machine CPU, Wine Quality, Automobile and Parkinson) and the 3 Use Cases (Protein Corona, Gajewicz Metal Oxides and Aquatic Toxicity)
Purpose
– The purpose of this paper is to assess the quality of commercial lubricant oils. A spectroscopic method was used in combination with multivariate regression techniques (ordinary multivariate multiple regression, principal components analysis, partial least squares, and support vector regression (SVR)).
Design/methodology/approach
– The r...
The J2EE framework has long dominated the development of enterprise applications. This gave rise to a rich ecosystem of tools, manuals, tutorials, etc., that explain the different alternatives or peculiarities involved in its implementation. The arrival of the .NET Framework, in the...
Experimental analysis starts with very similar premises: given a specific problem, we need to either collect or generate a dataset and to choose the best model according to the performance. A set of techniques can be evaluated (i.e. statistical or metaheuristic approaches) as well as results from previous works that should be taken into account. Th...
Genetic algorithms are search and optimization techniques which have their origin and inspiration in the world of biology. They provide very good results in different kinds of problems, but they are not free of complications. One of the most common problems that may arise with these techniques is that, despite a few generations obtain an approximati...
The interpretation of the results in a classification problem can be enhanced, especially in image texture analysis problems, by feature selection techniques that reveal which features contribute most to the classification performance. This paper presents an evaluation of a number of feature selection techniques for classification in a biomedical image...
Cell death (CD) is a dynamic biological function involved in physiological and pathological processes. Due to the complexity of CD, there is a demand for fast theoretical methods that can help to find new CD molecular targets. The current work presents the first classification model to predict CD-related proteins based on Markov Mean Propert...
Enzyme regulation proteins are very important due to their involvement in many biological processes that sustain life. The complexity of these proteins, the impossibility of identifying directly quantifiable molecular properties associated with the regulation of enzymatic activities, and their structural diversity create the necessi...
Given the background of the use of neural networks in apple juice classification problems, this paper aims at implementing a newly developed method in the field of machine learning: Support Vector Machines (SVM). Therefore, a hybrid model that combines genetic algorithms and support vector machines is suggested in such a way that, when using...
There are several different types of medical imaging modalities, among others magnetic resonance imaging (MRI), positron emission tomography (PET), ultrasound, computed tomography (CT) or two-dimensional electrophoresis images (2D-electrophoresis). The number of images is increasing rapidly and the development of automatic image processing systems...
The huge efforts made currently by atomic spectroscopists to resolve interferences and optimise instrumental measuring devices to increase accuracy and precision have led to a point where many of the difficulties that need to be solved nowadays cannot be described by simple classical linear regression methods and not even by other advanced linear r...
In this paper, a high-dimensional textural heterogeneous dataset is evaluated. This problem should be studied with specific techniques or a solution for decreasing dimensionality should be applied in order to improve the classification results. Thus, this problem is tackled by means of three different techniques: a specific technique such as Mul...
This paper describes a new technique for signal classification by means of Genetic Programming (GP). The novelty of this technique is that no prior knowledge of the signals is needed to extract the features. Instead, GP is able to extract the most relevant features needed for classification. This technique has been applied for the solution of...
Schizophrenia is a complex disease, with both genetic and environmental influence. Machine learning techniques can be used to associate different genetic variations at different genes with a (schizophrenic or non-schizophrenic) phenotype. Several machine learning techniques were applied to schizophrenia data to obtain the results presented in this...
In this paper, the influence of textural information is studied in two-dimensional electrophoresis gel images. A Genetic Algorithm-based feature selection technique is used in order to select the most representative textural features and reduce the original set (296 features) to a more efficient subset. Such a method makes use of a Support Vector Mac...
ANNs are one of the most successful learning systems. For this reason, many techniques have been published for obtaining feed-forward networks. However, few works describe techniques for developing recurrent networks. This work uses a genetic algorithm for automatic recurrent ANN development. This system has been applied to solve a we...
Data mining, a part of the Knowledge Discovery in Databases process (KDD), is the process of extracting patterns from large data sets by combining methods from statistics and artificial intelligence with database management. Analyses of epigenetic data have evolved towards genome-wide and high-throughput approaches, thus generating great amounts of...
Data mining, a part of the Knowledge Discovery in Databases process (KDD), is the process of extracting patterns from large data sets by combining methods from statistics and artificial intelligence with database management. Analyses of biomedical data have evolved towards genome-wide and high-throughput approaches, thus generating great amounts of...
Over recent years, Genetic Algorithms have proven to be an appropriate tool for solving certain problems. However, their classic approach is insufficient when the search space has several valid solutions. To this end, the idea of dividing the individuals into species has been successfully raised. However, this solution is not f...
Lipid-Binding Proteins (LIBPs) or Fatty Acid-Binding Proteins (FABPs) play an important role in many diseases such as different types of cancer, kidney injury, atherosclerosis, diabetes, intestinal ischemia and parasitic infections. Thus, the computational methods that can predict LIBPs based on 3D structure parameters became a goal of major import...
Abstract: This work proposes an algorithm for analyzing the trajectories of fish inside vertical slot fishways, hydraulic structures that allow fish to get past structures such as dams that obstruct natural processes in rivers. The proposed technique is intended to study the behavior of the...
Lipid-Binding proteins (LIBPs) or Fatty-Acid Binding Proteins (FABPs) play an important role in many diseases such as different types of cancer, kidney injury, atherosclerosis, diabetes, intestinal ischemia and parasitic infections. Thus, the computational methods that can predict LIBPs based on 3D structure parameters became a goal of major import...
Nature has proved to be the best testing system, where we can analyze the effectiveness of any problem-solving method. It provides one of the most complex problems to be resolved: survival. Analyzing how the species behave to achieve that survival, soft computing methods try to mimic this behavior to provide meaningful solutions to diverse...
Genetic Algorithms (GAs) are a technique that has given good results to those problems that require a search through a complex space of possible solutions. A key point of GAs is the necessity of maintaining the diversity in the population. Without this diversity, the population converges and the search prematurely stops, not being able to reach the...
This chapter presents a soft computing system developed to optimize the laser milling manufacture of high value steel components, a relatively new and interesting industrial technique. This applied research presents a multidisciplinary study based on the application of unsupervised neural projection models in conjunction with identification systems...
Controlling a biped robot with several degrees of freedom is a challenging task that takes the attention of several researchers in the fields of biology, physics, electronics, computer science and mechanics. For a humanoid robot to perform in complex environments, fast, stable and adaptive behaviors are required. This paper proposes a solution for...
This is the first book for atomic spectroscopists to present the basic principles of experimental designs, optimization and multivariate regression. Multivariate regression is a valuable statistical method for handling complex problems (such as spectral and chemical interferences) which arise during atomic spectrometry. However, the technique is un...
Traditionally, Evolutionary Computation (EC) techniques, and more specifically Genetic Algorithms (GAs), have proved to be efficient when solving various problems; however, as a possible drawback, GAs tend to provide a single solution for the problem to which they are applied. Some non-global solutions discarded during the search of the bes...
Traditionally, Evolutionary Computation (EC) techniques, and more specifically Genetic Algorithms (GAs) (Goldberg & Wang, 1989), have proved to be efficient when solving various problems; however, as a possible drawback, GAs tend to provide a single solution for the problem to which they are applied. Some non-global solutions discarded duri...
This work addresses the process of adapting the Artificial Neural Networks (ANN) subject, included in the Computing Sciences Master Degree of the University of A Coruña, to the European Higher Education Area (EHEA). As this subject was originally part of the catalogue of optional subjects offered at the Faculty of Computer Sciences, it has...
The importance of juice beverages in daily food habits makes juice authentication an important issue, for example, to avoid fraudulent practices. A successful classification model should address two important cornerstones of the quality control of juice-based beverages: to monitor the amount of juice and to monitor the amount (and nature) of other s...
This chapter shows several approaches for determining the most relevant subset of variables for a classification task, which improves the efficiency of the classification model. A particular evolutionary computation technique, genetic algorithms, is applied with the aim of obtaining a general method of variable selection w...
Four genetic-algorithm-based approaches to variable selection in spectral data sets are presented. They range from a pure black-box approach to a chemically driven one. The latter uses a fitness function that takes into account not only typical parameters like the number of errors when classifying a training set but also the chemical interpretabili...
2.1 INTRODUCTION As time goes by, the complexity of the problems addressed by the different branches of science grows steadily. Along with this growth come the time and effort required to solve these problems with classical techniques, either because in principle it is not known how to...
This paper presents a new model for computational embryology that mimics the behaviour of biological cells, whose characteristics can be applied to the solution of computational problems. The presented tests apply the model to simple structure generation and provide promising results with regard to its behaviour and applicability to more complex p...
This paper presents a study in which a new technique for automatically developing Artificial Neural Networks (ANNs) by means of Evolutionary Computation (EC) tools is compared with the traditional evolutionary techniques used for ANN development. The technique used here is based on network encoding on graphs and also their performance and evo...
This paper presents a new model that can be framed within so-called Computational Embryology or Natural Computation. This discipline takes the behaviour of biological cells and tries to adapt some of their characteristics to artificial cells in order to solve computational problems. Besides the theoretical approach, some of the tests that w...
Artificial Neural Networks have achieved satisfactory results in different fields such as example classification or image identification. Real-world processes usually have a temporal evolution, and they are the type of processes where Recurrent Networks have special success. Nevertheless they are still reluctantly used, mainly due to the fact that...
Being based on the theory of evolution and natural selection, Genetic Algorithms (GAs) are a technique that has proved effective for solving problems that require a search through a complex space of possible solutions. The maintenance of a population of possible solutions that are in constant evolution may lead to...
The importance of fruit beverages, and of apple juice in particular, in daily food habits makes juice authentication an important issue in order to avoid fraudulent practices and to protect human health. Among the instrumental techniques available in analytical laboratories, infrared spectrometry (IR) is a fast and convenient technique to perform s...