María Martínez Ballesteros

  • Computer Science, PhD
  • Professor (Associate) at University of Seville

Associate Professor of Computer Science and Coordinator of the Master's in Computer Engineering

About

Publications: 61
Reads: 12,460
Citations: 752
Introduction
I specialize in Machine Learning, with an emphasis on unsupervised learning algorithms such as association rules. My work spans domains such as bioinformatics, the environment, and time series analysis. My recent publications address new deep learning architectures for time series prediction (wind energy, electricity, etc.), explainability in machine learning models, and applications of machine learning to genomic cancer data.
Current institution
University of Seville
Current position
  • Professor (Associate)

Publications (61)
Article
Full-text available
This paper introduces a new model-independent metric, called RExQUAL, for quantifying and comparing the quality of explanations provided by attribution-based explainable artificial intelligence techniques. The underlying idea is based on feature attribution, using a subset of the ranking of the attributes highlighted by a model-agnostic explai...
Chapter
Reference evapotranspiration is a crucial metric in agricultural contexts, characterizing the evapotranspiration rate from a well-hydrated surface and serving as a fundamental benchmark for water management and crop irrigation, especially in arid regions. This study applied neural networks that integrate a Temporal Selection Layer to enhance the p...
Article
Feature selection is a widely studied technique whose goal is to reduce the dimensionality of the problem by removing irrelevant features. It has multiple benefits, such as improved efficacy, efficiency and interpretability of almost any type of machine learning model. Feature selection techniques may be divided into three main categories, dependin...
Article
Time series forecasting is a well-known deep learning application field in which previous data are used to predict the future behavior of the series. Recently, several deep learning approaches have been proposed in which several nonlinear functions are applied to the input to obtain the output. In this paper, we introduce a novel method to improve...
Chapter
Traditional time series forecasting models often use all available variables, including potentially irrelevant or noisy features, which can lead to overfitting and poor performance. Feature selection can help address this issue by selecting the most informative variables in the temporal and feature dimensions. However, selecting the right features...
Chapter
This paper explores the use of deep learning techniques for detecting sleep apnea. Sleep apnea is a common sleep disorder characterized by abnormal breathing pauses or infrequent breathing during sleep. The current standard for diagnosing sleep apnea involves overnight polysomnography, which is expensive and requires specialized equipment and perso...
Chapter
This paper proposes a novel approach that combines an association rule algorithm with a deep learning model to enhance the interpretability of prediction outcomes. The study aims to gain insights into the patterns that were learned correctly or incorrectly by the model. To identify these scenarios, an association rule algorithm is applied to extrac...
Chapter
Sarcomas are rare mesodermal tumors of heterogeneous nature and have a higher incidence in children. The relative 5-year survival rate for patients with metastatic sarcoma is usually low. Standard treatment for sarcomas involves surgical resection, and investigating the genetic basis of these tumors through genome-wide analysis is crucial due to th...
Chapter
Renewable energies are currently experiencing promising growth as an alternative solution to minimize the emission of pollutant gases from the use of fossil fuels, which contribute to global warming. To integrate these renewable energies safely with the grid system and make the electric grid system more stable, it is vitally important to accurately...
Chapter
The quality of university teaching is essential for the success of students and the academic excellence of an educational institution. The purpose of this work is to provide a methodology based on the Association Rule technique using the Apriori algorithm to analyze the results obtained from the student evaluation process regarding their satisfacti...
Conference Paper
The year 2022 was the driest year in Portugal since 1931, with 97% of the territory in severe drought. Water is especially important for the agricultural sector in Portugal, as it represents 78% of total consumption according to the Water Footprint report published in 2010. Reference evapotranspiration is essential due to its importance in optimal irrigati...
Conference Paper
Deep learning has become one of the most useful tools in recent years for mining information from large datasets. Despite its successful application to many research fields, deep learning is known as a black box approach, and most experts have difficulty explaining and interpreting deep learning results. In this context, explainable artificial i...
Article
Full-text available
Sarcomas are a group of heterogeneous mesodermic rare tumors with high incidence in children, reaching up to 20% of neoplasms. Standard treatment for sarcomas is surgical resection, and only some patients are treated with chemotherapy and/or radiation therapy. The 5-year relative survival rate for patients with metastatic sarcoma is only 15%. There...
Article
Full-text available
Ensuring the optimal performance of power transformers is a laborious task in which the insulation system plays a vital role in decreasing their deterioration. The insulation system uses insulating oil to control temperature, as high temperatures can reduce the lifetime of the transformers and lead to expensive maintenance. Deep learning architectu...
Article
Full-text available
Time series are one of the most common data types in industry nowadays. Forecasting the future behavior of a time series can be useful for planning ahead, saving time and resources, and helping to avoid undesired scenarios. Forecasts are built from historical data because of the causal nature of time series. Several deep learning algorithms...
Article
Full-text available
Renewable energies, such as solar and wind power, have become promising sources of energy to address the increase in greenhouse gases caused by the use of fossil fuels and to resolve the current energy crisis. Integrating wind energy into a large-scale electric grid presents a significant challenge due to the high intermittency and nonlinear behavi...
Conference Paper
Full-text available
Machine and deep learning have become some of the most useful tools in recent years for diagnosis and decision support in the health area. However, it is widely known that artificial intelligence models are considered a black box, and most experts experience difficulties explaining and interpreting the models and their results. In this context, ex...
Article
Machine and deep learning have become some of the most useful tools in recent years for diagnosis and decision support in the health area. However, it is widely known that artificial intelligence models are considered a black box, and most experts experience difficulties explaining and interpreting the models and their results. In this context, ex...
Chapter
Neural networks have proven to be a good alternative in application fields such as healthcare, time-series forecasting and artificial vision, among others, for tasks like regression or classification. Their potential has been particularly remarkable in unstructured data, but recently developed architectures or their ensemble with other classical me...
Article
Full-text available
https://authors.elsevier.com/a/1c0AE3KEGaD6fQ Breast cancer is the most frequent cancer in women and the second most frequent overall after lung cancer. Although the 5-year survival rate of breast cancer is relatively high, recurrence is also common which often involves metastasis with its consequent threat for patients. DNA methylation-derived da...
Article
Full-text available
Unemployment in Spain is one of the biggest concerns of its inhabitants. Its unemployment rate is the second highest in the European Union, and in the second quarter of 2018 the unemployment rate was 15.2%, some 3.4 million unemployed. Construction is one of the activity sectors that have suffered the most from the economic crisis. In addition,...
Article
https://authors.elsevier.com/c/1Yg1X4ZQDzkEk Clustering is one of the most commonly used techniques in data mining. Its main goal is to group objects into clusters so that each group contains objects that are more similar to each other than to objects in other clusters. The evaluation of a clustering solution is a task carried out through the appl...
Conference Paper
Full-text available
Clustering is one of the most widely used techniques in data mining. Its main goal is to group data into clusters so that objects belonging to the same cluster are more similar to each other than to those belonging to different clusters. The validation of a clustering is a task carried out by applying so-called valid...
Article
Many algorithms have emerged in recent years to address the discovery of quantitative association rules from datasets. However, this task is becoming a challenge because the processing power of most existing techniques is not enough to handle the large amount of data generated nowadays. These vast amounts of data are known as Big Data. A number o...
Article
Clustering analysis is one of the most used Machine Learning techniques to discover groups among data objects. Some clustering methods require the number of clusters into which the data is going to be partitioned. There exist several cluster validity indices that help us to approximate the optimal number of clusters of the dataset. However, such in...
Article
Breast cancer is the most common cause of cancer death in women. Today, post-transcriptional protein products of the genes involved in breast cancer can be identified by immunohistochemistry. However, this method has problems arising from the intra-observer and inter-observer variability in the assessment of pathologic variables, which may result i...
Article
Alzheimer’s disease is a complex progressive neurodegenerative brain disorder whose prevalence is expected to rise over the next decades. Unconventional strategies for elucidating the genetic mechanisms are necessary due to its polygenic nature. In this work, there are five input information sources: a public DNA microarray that measures expression...
Article
Full-text available
This work aims at correcting flaws existing in multi-objective evolutionary schemes to discover quantitative association rules, specifically those based on the well-known non-dominated sorting genetic algorithm-II (NSGA-II). In particular, a methodology is proposed to find the most suitable configurations based on the set of objectives to optimize...
Conference Paper
Full-text available
K-Means and Bisecting K-Means clustering algorithms need the optimal number of clusters into which the dataset may be divided. Spark implementations of these algorithms include a method that is used to calculate this number. Unfortunately, this measurement lacks precision because it only takes into account a sum of intra-cluster distances misleadi...
Conference Paper
Full-text available
A forecasting algorithm for big data time series is presented in this work. A nearest neighbours-based strategy is adopted as the main core of the algorithm. A detailed explanation of how to adapt and implement the algorithm to handle big data is provided. Although some parts remain iterative, and consequently require an enhanced implementation, e...
Conference Paper
This work proposes a methodology to identify genes highly related to cancer. In particular, a multi-objective evolutionary algorithm named CANGAR is applied to obtain quantitative association rules. Rules of this kind are used to identify dependencies between genes and their expression levels. Hierarchical cluster analysis, fold-change and review...
Article
Full-text available
Association rule mining is a well-known methodology to discover significant and apparently hidden relations among attributes in a subspace of instances from datasets. Genetic algorithms have been extensively used to find interesting association rules. However, the rule-matching task of such techniques usually requires high computational and memory...
Article
Full-text available
This work proposes a novel methodology to improve the discovery of quantitative association rules in continuous datasets. This methodology comprises several evolutionary algorithms able to deal with real-valued variables without performing a static discretization process. Additionally, several quality measures are analysed to select the set of meas...
Conference Paper
Full-text available
There exist several fitness function proposals based on a combination of weighted objectives to optimize the discovery of association rules. Nevertheless, some differences in the measures used to assess the quality of association rules could be obtained according to the values of such weights. Therefore, in such proposals it is very important the u...
Article
In the last decade, the interest in microarray technology has exponentially increased due to its ability to monitor the expression of thousands of genes simultaneously. The reconstruction of gene association networks from gene expression profiles is a relevant task and several statistical techniques have been proposed to build them. The problem lie...
Article
Full-text available
There exist several fitness function proposals based on a combination of weighted objectives to optimize the discovery of association rules. Nevertheless, some differences in the measures used to assess the quality of association rules could be obtained according to the values of such weights. Therefore, in such proposals it is very important the u...
Article
In this paper we propose an evolutionary method of association rules discovery (EQAR, Evolutionary Quantitative Association Rules) that extends a recently published algorithm by the authors and we describe its application to a problem of Total Ozone Content (TOC) modeling in the Iberian Peninsula. We use TOC data from the Total Ozone Mapping Spectr...
Article
Full-text available
Many approaches are currently devoted to finding DNA motifs in nucleotide sequences. However, this task remains challenging for specialists because of the difficulty of fully understanding gene regulatory mechanisms, especially when analyzing binding sites in DNA. These sites or specific nucleotide sequences are known to be responsible...
Article
Full-text available
The microarray technique is able to monitor the change in concentration of RNA in thousands of genes simultaneously. The interest in this technique has grown exponentially in recent years and the difficulties in analyzing data from such experiments, which are characterized by the high number of genes to be analyzed in relation to the low number of...
Article
Full-text available
An evolutionary approach for finding existing relationships among several variables of a multidimensional time series is presented in this work. The proposed model to discover these relationships is based on quantitative association rules. This algorithm, called QARGA (Quantitative Association Rules by Genetic Algorithm), uses a particular codifica...
Conference Paper
Full-text available
This paper presents an analysis of the relationships among different interestingness measures of the quality of association rules, as a first step towards selecting the best objectives for developing a multi-objective algorithm. For this purpose, the discovery of association rules is based on evolutionary techniques. Specifically, a genetic algorithm has been...
Conference Paper
Full-text available
The microarray technique is able to monitor the change in concentration of RNA in thousands of genes simultaneously. The interest in this technique has grown exponentially in recent years and the difficulties in analyzing data from such experiments, which are characterized by the high number of genes to be analyzed in relation to the low number of...
Article
Full-text available
This research presents the mining of quantitative association rules based on evolutionary computation techniques. First, a real-coded genetic algorithm that extends the well-known binary-coded CHC algorithm has been designed to determine the intervals that define the rules without needing to discretize the attributes. The proposed algorithm is eva...
Conference Paper
Full-text available
This work presents the discovery of association rules based on evolutionary techniques in order to obtain relationships among correlated time series. For this purpose, a genetic algorithm has been proposed to determine the intervals that form the rules without discretizing the attributes and allowing the overlapping of the regions covered by the...
