Jesus Carretero
University Carlos III de Madrid | UC3M · Department of Computer Science and Engineering

PhD in Computer Science

About

434
Publications
95,304
Reads
3,905
Citations
Additional affiliations
January 2000 - present
University Carlos III de Madrid
Position
  • Professor (Full)
Education
January 1990 - June 1995
Universidad Politécnica de Madrid
Field of study
  • Computer Science & Engineering
September 1983 - July 1989
Universidad Politécnica de Madrid
Field of study
  • Computer Science & Engineering

Publications

Conference Paper
Full-text available
In recent years, the needs of some applications, such as those related to Artificial Intelligence and big data, have evolved: I/O operations must be improved to avoid bottlenecks when accessing larger amounts of data. For this purpose, the Expand Ad-Hoc parallel file system is being designed and developed. Since these applications have very long exe...
Conference Paper
Full-text available
In recent years, the needs of some applications have evolved, such as applications in the areas of Artificial Intelligence and big data, where I/O operations need to be improved to avoid bottlenecks when accessing large amounts of data. To this end, we are designing and developing the file system pa...
Article
Full-text available
This paper presents Kawak, a GIS-bigdata model for merging retrospective meteorological data with ground-based observations to achieve comprehensive territorial coverage. This model identifies areas without active ground-based stations and creates virtual stations for these areas. Kawak incorporates AI algorithms for conducting spatio-temporal stud...
Article
Background: Collaborative comparisons and combinations of epidemic models are used as policy-relevant evidence during epidemic outbreaks. In the process of collecting multiple model projections, such collaborations may gain or lose relevant information. Typically, modellers contribute a probabilistic summary at each time-step. We compared this to di...
Article
Full-text available
Agent-based epidemiological simulators have been proven to be one of the most successful tools for the analysis of COVID-19 propagation. The ability of these tools to reproduce the behavior and interactions of each single individual leads to accurate and detailed results, which can be used to model fine-grained health-related policies like selectiv...
Article
Full-text available
The development of adaptive scheduling algorithms that take advantage of malleability has become a crucial area of research in many large-scale projects. Malleable workloads can improve the system’s performance but, at the same time, provide an extra dimension to the scheduling problem. This paper proposes an adaptive, performance-based job schedul...
Article
Full-text available
Introduction: Epidemiological models have proven crucial in supporting health authorities' decision-making during the COVID-19 pandemic, as well as in raising public awareness of the different measures adopted by the authorities (social distancing, mask wearing, vaccination, etc.). Objectives: To describe...
Article
This paper presents a novel feature for improving the scheduling process based on the performance prediction and the detection of CPU and I/O interference between applications. This feature consists of using malleable synthetic benchmarks – called clones – that reproduce the behaviour of applications executed in a cluster. These proxies can be used...
Conference Paper
MeshStore, a fault-tolerant serverless storage model for edge-fog-cloud continuum systems, enables organizations to integrate distributed heterogeneous storage resources into a single unified storage service for the sharing of data through serverless functions deployed on edge-fog-cloud environments to create continuum dataflows. This unified servi...
Conference Paper
Full-text available
The growing demand for data processing from new data-intensive applications is putting high pressure on the performance and capacity of high-performance storage systems. Therefore, new advances in storage technologies aim to meet these demand...
Conference Paper
Full-text available
In recent years, applications used in science have been evolving towards massive data analysis through workflows, driven by the growth of areas such as Artificial Intelligence and big data. However, the biggest bottleneck when running this type of application lies in the operations...
Conference Paper
Full-text available
Today, tuberculosis is one of the deadliest diseases. However, a fast diagnosis has been shown to have a great influence on the prognosis and evolution of the disease. The goal of this work is to speed up diagnosis time, as well as to improve the sensitivity of sputum microscopy as a tool...
Preprint
Full-text available
The development of adaptive scheduling algorithms that take advantage of malleability has become a crucial area of research in many large-scale projects. Malleable workloads can improve the system's performance but at the same time provide an extra dimension to the scheduling problem. This paper proposes an adaptive, performance-based job schedulin...
Chapter
The high energy consumption of computing platforms has become one of the major problems in high-performance computing (HPC). Computer energy consumption represents a significant percentage of the CO2 emissions that occur each year in the world; it is therefore crucial to develop energy-efficiency techniques in order to reduce the energy consumptio...
Chapter
Advancement in storage technologies, such as NVMe and persistent memory, enables the acceleration of I/O operations in HPC systems. However, relying solely on ultra-fast storage devices is not cost-effective, leading to the need for multi-tier storage hierarchies to move data based on its usage. Ad-hoc file systems have been proposed as a solution,...
Chapter
The growing demands for data processing by new data-intensive applications are putting pressure on the performance and capacity of HPC storage systems. The advancement in storage technologies, such as NVMe and persistent memory, are aimed at meeting these demands. However, relying solely on ultra-fast storage devices is not cost-effective, leading...
Article
Cloud-based services have proved useful in several research fields, such as engineering, health science, and astrophysics, to mention a few examples. The computational environmental science community developed a strong need for cloud facilities to store, process, and manage data from observations and numerical models for simulations and forecasts....
Poster
Full-text available
The growing demands for data processing by new data-intensive applications are putting pressure on the performance and capacity of HPC storage systems. The advancement in storage technologies, such as NVMe and persistent memory, are aimed at meeting these demands. However, relying solely on ultra-fast storage devices is not cost-effective, leading...
Preprint
Full-text available
Background. Collaborative comparisons and combinations of multiple epidemic models are used as policy-relevant evidence during epidemic outbreaks. Typically, each modeller summarises their own distribution of simulated trajectories using descriptive statistics at each modelled time step. We explored information losses compared to directly collectin...
Conference Paper
An Ad-Hoc File System dynamically virtualizes storage on compute nodes into a fast storage volume to reduce congestion on parallel file systems used as backends in HPC environments and improve data locality. This paper presents Expand Ad-Hoc, a version of the Expand parallel file system, for use as an Ad-Hoc storage system for HPC environments. Su...
Chapter
Full-text available
Nowadays, tuberculosis is one of the deadliest diseases. Nevertheless, an accurate and fast diagnosis has a great influence on disease prognosis. The research goal of this work is to speed up the time to diagnosis, as well as to improve the sensitivity of sputum microscopy as a tuberculosis diagnosis tool. This work presents a novel deep learning...
Chapter
The current static usage model of HPC systems is becoming increasingly inefficient due to the continuously growing complexity of system architectures, combined with the increased usage of coupled applications, the need for strong scaling with extreme-scale parallelism, and the increasing reliance on complex and dynamic workflows. Malleabil...
Article
Current high-performance interconnection networks for HPC and Data-Center systems incorporate mechanisms to prevent congestion from degrading network performance. Specifically, the popular InfiniBand specification defines a mechanism to reduce the injection rate of the traffic flows contributing to congestion. However, the efficiency of this mechan...
Article
This paper presents Xel, a cloud-agnostic data platform for the design-driven building of high-availability data science services as a support tool for data-driven decision-making. We designed and implemented Xel based on four main components: (a) a high level and driven-design framework for end-users to select analytic and machine learning tools f...
Article
Health data science systems are becoming key to supporting healthcare decision-making processes. These systems should achieve continuous data processing and adapt their behavior to changes arising in real scenarios. However, building this type of self-adaptable system is not trivial, as it requires integrating data analytics and artifici...
Chapter
LIMITLESS is a lightweight and scalable framework that provides a holistic view of the system employing the combination of both platform and application monitoring. This paper presents a novel feature for improving the scheduling process based on the performance prediction and the detection of interference between real applications. This feature co...
Chapter
Computer applications are growing in terms of data management requirements. In both scientific and engineering domains, high-performance computing clusters tend to experience bottlenecks in the I/O layer, limiting the scalability of data-intensive based applications. Thus, minimizing the number of cycles required by I/O operations constitutes a wid...
Article
This paper presents a continuous delivery/continuous verifiability (CD/CV) method for IoT dataflows in edge–fog–cloud. A CD model based on extraction, transformation, and load (ETL) mechanism as well as a directed acyclic graph (DAG) construction, enable end-users to create efficient schemes for the continuous verification and validation of the exe...
Article
Full-text available
Virtualization has become one of the main tools for making efficient use of the resources offered by multicore embedded platforms. In recent years, even sectors such as space, aviation, and automotive, traditionally wary of adopting this type of technology due to the impact it could have on the safety of their systems, have been forced to introduce...
Article
Full-text available
Objective: We analyse the impact of different vaccination strategies on the propagation of COVID-19 within the Madrid metropolitan area, starting on 27 December 2020 and ending in the summer of 2021. Materials and methods: The predictions are based on simulation using EpiGraph, an agent-based COVID-19 simulator. We first summarise the different models i...
Article
Full-text available
Background: Nowadays, doctors and radiologists are overwhelmed with a huge amount of work. This has led to efforts to design different computer-aided diagnosis (CAD) systems, with the aim of accomplishing a faster and more accurate diagnosis. The current development of deep learning is a big opportunity for the development of new CADs. In this...
Conference Paper
Full-text available
An ad-hoc file system dynamically virtualizes storage on the compute nodes into a fast storage volume, reducing congestion on the parallel file systems used as backends in HPC environments and improving data locality. This work presents Expand Ad-Hoc, a version of the...
Conference Paper
Full-text available
High-level programming models can help application developers to access and use resources without the need to manage low-level architectural entities, as a parallel programming model defines a set of programming abstractions that simplify the way by which a programmer structures and expresses her/his algorithm. Early proposals of Exascale programmi...
Article
Full-text available
Supply chains today play a crucial role in the success of a company’s logistics. In recent years, multiple investigations have focused on incorporating new technologies into supply chains, the Internet of Things (IoT) and blockchain being two of the most recent and popular technologies applied. However, their usage currently faces considerable challenges, s...
Chapter
The transmission of COVID-19 through a population depends on many factors which model, incorporate, and integrate many heterogeneous data sources. The work we describe in this paper focuses on the data management aspect of EpiGraph, a scalable agent-based virus-propagation simulator. We describe the data acquisition and pre-processing tasks that ar...
Article
This work presents LIMITLESS, an HPC framework that provides new strategies for monitoring clusters. LIMITLESS is a scalable, lightweight monitor that is integrated with other HPC runtimes in order to obtain a holistic view of the system that combines both platform and application monitoring. This paper presents a description of the novel components...
Article
This paper presents the design, development, and evaluation of PuzzleMesh, an agnostic service mesh composition model to process large volumes of data in edge-fog-cloud environments. This model is based on a puzzle metaphor where pieces, puzzles, and metapuzzles represent self-contained autonomous and reusable software artifacts encapsulated into c...
Chapter
Today’s data-intensive applications require access to multiple types of storage platforms, such as parallel file systems, distributed file systems, and in-memory data systems. In addition, many applications are demanding the processing of data streams. The goal is to develop mechanisms to integrate and hide the diversity of data sources from applic...
Chapter
We present a study of the performance of the Weather Research and Forecasting [WRF] code under several hardware configurations in an HPC environment. The WRF code is a standard code for weather prediction, used in several fields of science and industry. The metrics used in this case are the execution time of the run and the energy consumption of th...
Conference Paper
Cloud storage has been the solution for organizations to manage the exponential growth of data observed over the past few years. However, end-users still suffer from side-effects of cloud service outages, which particularly affect edge-fog-cloud environments. This paper presents SeRSS, a storage mesh architecture to create and operate reliable, con...
Article
On November 25, 2021, the European Medicines Agency (EMA) authorized the presentation of Comirnaty vaccine (Pfizer-Biontech) for children between 5 and 11 years of age. In our country, this vaccination began on December 15, after it was approved by the Public Health Commission. A mathematical model has been developed to evaluate the possible impact...
Article
Full-text available
Data synchronization and content delivery services are key to supporting healthcare dataflows built by organizations. These types of services must prepare and process the data to accomplish mandatory non-functional requirements, such as security and reliability. This is a challenge as multiple applications, infrastructures, and platforms participat...
Article
Full-text available
This survey reviews the scientific literature on techniques for reducing interference in real-time multicore systems, focusing on the approaches proposed between 2015 and 2020. It also presents proposals that use interference reduction techniques without considering the predictability issue. The survey highlights interference sources and categorize...
Preprint
Full-text available
Background: This work analyses the impact of different vaccination strategies on the propagation of COVID-19 within the Madrid metropolitan area, starting on 27 December 2020 and ending in the summer of 2021. The predictions are based on simulation using EpiGraph, an agent-based COVID-19 simulator. Methods: We briefly summarize the different...
Article
Full-text available
The main purpose of this work is to investigate and compare several deep-learning-enhanced techniques applied to X-ray and CT-scan medical images for the detection of COVID-19. In this paper, we used four powerful pre-trained CNN models, VGG16, DenseNet121, ResNet50, and ResNet152, for the COVID-19 CT-scan binary classification task. The proposed Fa...
Article
Full-text available
As long as critical levels of vaccination have not been reached to ensure herd immunity, and new SARS-CoV-2 strains are emerging, the only realistic way to reduce the infection speed in a population is to track infected individuals before they pass on the virus. Testing the population via sampling has shown good results in slowing the epidem...
Conference Paper
Today, supply chains are crucial for companies to achieve good logistics. In recent years, different investigations have been carried out to incorporate new technologies into supply chains, the Internet of Things (IoT) and blockchain being two of the most widely applied. However, they present some...
Article
This paper presents a novel transversal, agnostic-infrastructure, and generic processing model to build environmental big data services in the cloud. Transversality is used for building processing structures (PS) by reusing/coupling multiple existent software for processing environmental monitoring, climate, and earth observation data, even in exec...
Article
Full-text available
This paper proposes a new version of the power-of-two-choices, SQ(d), load-balancing algorithm. This new algorithm improves the performance of the classical model based on the power of two choices randomized load balancing. This model considers jobs that arrive at a dispatcher as a Poisson stream of rate \(\lambda n\), \(\lambda < 1\), at a set of n...
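The classical SQ(d) policy this paper builds on is simple enough to sketch. The toy below is illustrative only (it is not the paper's improved algorithm, and the function and parameter names are this sketch's own): the dispatcher samples d queues uniformly at random and routes the job to the shortest of the sampled queues.

```python
import random

def sq_d_dispatch(queues, d, rng=random):
    """Power-of-d-choices (SQ(d)) dispatch: sample d queue indices
    uniformly at random, send the job to the shortest sampled queue,
    and return the chosen index. `queues` holds current queue lengths."""
    sampled = rng.sample(range(len(queues)), d)
    target = min(sampled, key=lambda i: queues[i])
    queues[target] += 1
    return target
```

With d = 2, sampling just two queues already keeps the maximum load close to the average, which is why the "power of two choices" result is a standard baseline for randomized load balancing.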
Article
Full-text available
This work presents simulation results for different mitigation and confinement scenarios for the propagation of COVID-19 in the metropolitan area of Madrid. These scenarios were implemented and tested using EpiGraph, an epidemic simulator which has been extended to simulate COVID-19 propagation. EpiGraph implements a social interaction model, which...
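EpiGraph's social interaction model is much richer than this, but the core step of any agent-based propagation simulator, infected agents transmitting to their neighbours in a contact graph with some probability, can be sketched as follows (a minimal illustration with hypothetical names, not EpiGraph's actual code):

```python
import random

def propagate(contacts, infected, p_transmit, rng=random):
    """One simulation step: each infected agent independently infects
    each susceptible neighbour with probability p_transmit.
    `contacts` maps an agent id to a list of neighbour ids; `infected`
    is the set of currently infected agent ids. Returns the new set."""
    newly_infected = set()
    for agent in infected:
        for neighbour in contacts.get(agent, []):
            if neighbour not in infected and rng.random() < p_transmit:
                newly_infected.add(neighbour)
    return infected | newly_infected
```

Iterating this step over time, with per-edge probabilities modulated by interventions (distancing, masks, vaccination), is the basic pattern that agent-based simulators refine.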
Conference Paper
In organizational environments, such as in hospitals, data have to be processed, preserved, and shared with other organizations in a cost-efficient manner. Moreover, organizations have to accomplish different mandatory non-functional requirements imposed by the laws, protocols, and norms of each country. In this context, this paper presents a Feder...
Article
Computational methods are nowadays ubiquitous in the field of bioinformatics and biomedicine. Besides established fields like molecular dynamics, genomics or neuroimaging, new emerging methods rely heavily on large scale computational resources. These new methods need to manage Tbytes or Pbytes of data with large-scale structural and functional rel...
Article
Full-text available
A complex and important task in the cloud resource management is the efficient allocation of virtual machines (VMs), or containers, in physical machines (PMs). The evaluation of VM placement techniques in real-world clouds can be tedious, complex and time-consuming. This situation has motivated an increasing use of cloud simulators that facilitate...
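The VM-to-PM allocation problem described here is, at its simplest, a bin-packing problem. A common baseline heuristic (not necessarily the policy evaluated in the paper; the names below are this sketch's own) is first-fit decreasing:

```python
def first_fit_decreasing(vm_demands, pm_capacity):
    """Place VMs (each a scalar resource demand) onto PMs of equal
    capacity: sort demands descending, put each VM on the first PM
    with enough free capacity, and open a new PM when none fits.
    Returns a list of PMs, each a list of the demands placed on it."""
    placed = []   # placed[i]: demands assigned to PM i
    free = []     # free[i]: remaining capacity of PM i
    for demand in sorted(vm_demands, reverse=True):
        if demand > pm_capacity:
            raise ValueError("VM demand exceeds PM capacity")
        for i, f in enumerate(free):
            if demand <= f:
                free[i] -= demand
                placed[i].append(demand)
                break
        else:
            free.append(pm_capacity - demand)
            placed.append([demand])
    return placed
```

Real placement engines extend this along several axes (multiple resource dimensions, migration costs, affinity constraints), which is precisely what makes simulator-based evaluation attractive.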
Article
Full-text available
The edge, the fog, the cloud, and even the end-user’s devices play a key role in the management of the health-sensitive content/data lifecycle. However, the creation and management of solutions including multiple applications executed by multiple users in multiple environments (the edge, the fog, and the cloud) to process multiple health repositories t...
Article
This paper presents a processing model for big IoT data. The model includes a continuous delivery scheme based on building blocks for constructing software pipelines from the edge to the cloud. It also includes a data preparation scheme based on parallel patterns for establishing, in an efficient manner, controls over the production and consumption...
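Controls over the production and consumption of data, as mentioned here, are commonly realized as back-pressure between pipeline stages. A minimal sketch using a bounded queue (hypothetical names, not the paper's building blocks):

```python
import queue
import threading

def run_pipeline(items, transform, maxsize=4):
    """Bounded producer/consumer stage: the queue's maxsize throttles
    the producer (put blocks when the queue is full) whenever the
    consumer falls behind. Items must not contain None, which is used
    as the end-of-stream sentinel. Returns transformed items in order."""
    q = queue.Queue(maxsize=maxsize)
    results = []

    def producer():
        for item in items:
            q.put(item)      # blocks when the queue is full: back-pressure
        q.put(None)          # sentinel: no more items

    def consumer():
        while True:
            item = q.get()
            if item is None:
                break
            results.append(transform(item))

    threads = [threading.Thread(target=producer),
               threading.Thread(target=consumer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Chaining such bounded stages from the edge to the cloud gives each hop a natural rate limit without any global coordinator.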
Article
One of the latest trends in Computed Tomography (CT) is the reduction of the radiation dose delivered to patients through the decrease of the amount of acquired data. This reduction results in artifacts in the final images if conventional reconstruction methods are used, making it advisable to employ iterative algorithms to enhance image quality. M...