
Alejandro Calderón
University Carlos III de Madrid, Leganés, Madrid, Spain · Computer Science and Engineering Department
PhD
About
81
Publications
11,402
Reads
872
Citations
Introduction
Additional affiliations
September 2000 - February 2017
University Carlos III de Madrid, Leganés, Madrid, Spain
Position
- Researcher + Professor (Assistant)
September 2000 - present
Publications (81)
In recent years, the needs of applications in areas such as Artificial Intelligence and big data have evolved, and I/O operations must improve to avoid bottlenecks when accessing larger amounts of data. For this purpose, the Expand Ad-Hoc parallel file system is being designed and developed.
Since these applications have very long exe...
IoT devices are devices interconnected through a network that can interact without human intervention. They can range from sensors to everyday objects such as a refrigerator or a car. These devices have evolved over time to gain greater computing and storage capacity. This ev...
Teaching assembly language is very important because it lets students understand different aspects of a computer's internal architecture. Assembly language simulators are commonly used to make it easier to set up the working environment. However, the usual simulators do not allow running these prog...
In recent years, the needs of some applications have evolved, such as applications in the areas of Artificial Intelligence and big data, which require better I/O operations to avoid bottlenecks when accessing large amounts of data.
To this end, work is under way to design and develop the file system...
This work introduces a new integrated development environment called CREATOR (didaCtic and geneRic assEmbly progrAmming simulaTOR) that provides an interactive platform for educational assembly programming. CREATOR is specially designed for the academic environment, so it is very intuitive to users who are encountering assembly for the first time....
In recent years, the applications used in science have been evolving toward massive data analysis through workflows, driven by the growth of areas such as Artificial Intelligence and big data.
However, the biggest bottleneck when running this kind of application lies in the operations...
There are currently many assembly language simulators that let students see and understand how assembly programs execute. However, these simulators hide from students all the implications, in terms of performance, memory consumption, or energy consumption, of running these programs on...
The continuous growth in the use of sensors and smart devices for home applications in recent years has led to the deployment of infrastructures and environments able to manage, process, and store all the generated information in a short period of time.
These environments have to deal with the challenges of an environment that is chang...
An Ad-Hoc File System dynamically virtualizes storage on compute nodes into a fast storage volume to reduce congestion on parallel file systems used as backends in HPC environments and improve data locality.
This paper presents Expand Ad-Hoc, a version of the Expand parallel file system, for use as an Ad-Hoc storage system for HPC environments. Su...
An ad-hoc file system dynamically virtualizes storage on compute nodes into a fast storage volume, which reduces congestion on the parallel file systems used as backends in HPC environments and improves data locality. This work presents Expand Ad-Hoc, a version of the...
This article presents CREATOR, a simulator for assembly programming developed by the ARCOS group at UC3M. The simulator lets users define the syntax and behavior of any instruction set, as well as the parameter-passing convention used. Once each particular instruction set has been defined (MIPS32...
This paper proposes a new version of the power-of-two-choices, SQ(d), load balancing algorithm. The new algorithm improves on the performance of the classical model based on power-of-two-choices randomized load balancing. This model considers jobs that arrive at a dispatcher as a Poisson stream of rate \(\lambda n\), \(\lambda < 1\), at a set of n...
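As a rough illustration of the classical SQ(d) policy this abstract builds on (not the new variant the paper proposes, whose details are cut off above), the sketch below has each arriving job sample d servers at random and join the least loaded one. It is deliberately simplified: jobs never depart, so it models placement only, and the function name `assign_jobs` and its parameters are hypothetical.

```python
import random

def assign_jobs(n_servers, n_jobs, d, seed=0):
    """Place jobs one by one; each job samples d distinct servers
    uniformly at random and joins the least loaded of them.
    Simplification: jobs are never served, so loads only grow."""
    rng = random.Random(seed)
    loads = [0] * n_servers
    for _ in range(n_jobs):
        candidates = rng.sample(range(n_servers), d)
        target = min(candidates, key=lambda s: loads[s])
        loads[target] += 1
    return loads

if __name__ == "__main__":
    # d=1 is purely random placement; d=2 is the "power of two choices".
    for d in (1, 2):
        loads = assign_jobs(n_servers=1000, n_jobs=1000, d=d)
        print(f"d={d}: maximum queue length = {max(loads)}")
```

Running it with d=1 and d=2 typically shows a visibly smaller maximum queue length for d=2, which is the effect the "power of two choices" name refers to.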
The ever-increasing growth of data centres and fog resources makes it difficult for current simulation frameworks to model large computing infrastructures. Therefore, a major trade-off for simulators is the balance between the abstraction level of the models, the scalability, and the performance of the executions. To better balance these, early f...
Our educational project has three primary goals. First, we want to provide a robust vision of how hardware and software interplay, by integrating the design of an instruction set (through microprogramming) and using that instruction set for assembly programming. Second, we wish to offer a versatile and interactive tool where the previous integrated...
Mobile cloud computing is a paradigm that delivers applications to mobile devices by using cloud computing. In this way, mobile cloud computing allows for a rich user experience; since client applications run remotely in the cloud infrastructure, applications use fewer resources in the user’s mobile devices. In this paper, we present a new mobile c...
Volunteer Computing is a type of distributed computing in which ordinary people donate their idle computer time to science projects like SETI@Home, Climateprediction.net and many others. In a similar way, Desktop Grid Computing is a form of distributed computing in which an organization uses its existing computers to handle its own long-running com...
Volunteer computing is a type of distributed computing in which ordinary people donate computing resources to scientific projects. BOINC is the main middleware system for this type of distributed computing. The aim of volunteer computing is that organizations be able to attain large computing power thanks to the participation of volunteer clients i...
Volunteer computing is a type of distributed computing in which ordinary people donate processing and storage resources to scientific projects. BOINC is the main middleware system for this type of computing. The aim of volunteer computing is that organizations be able to attain large computing power thanks to the participation of volunteer clients...
Massively parallel architectures are mainly based on a parallel heterogeneous setup. They are composed of different computing devices that speed up specific code regions, named kernels. These kernels are usually executed offline in the corresponding devices. Porting applications to a specific heterogeneous platform is a costly task in terms of time...
The objective of data compression is to avoid redundancy in order to reduce the size of the data to be stored or transmitted. In some scenarios, data compression may help to increase global performance by reducing the amount of data at a competitive cost in terms of global time and energy consumption. We have introduced computational compression as...
What sets WepSIM(1) apart from other simulators used in teaching Computer Structure lies in three important aspects. First, it offers an integrated view of microprogramming and assembly programming, with the option to work with different instruction sets. Second, it gives the student greater m...
The philosophy behind grid is to use idle resources to achieve a higher level of computational services (computation, storage, etc). Existing data grid solutions are based on new servers, specific APIs, and protocols; however, this approach is not a realistic solution for enterprises and universities, because it entails the deployment of new data...
The MPI forum is actively working toward a better MPI standard. The results are the new version 3 of the MPI standard and the efforts toward the upcoming MPI 3.1/4.0. The technological changes provide many opportunities for improvements and new ideas. This paper introduces two main contributions in this direction: (1) how to improve the MPI_Info object...
In this paper we present several advances for managing caching resources in multimedia streaming systems. The improvements aim to increase the performance of these resources. First, we use an analytical model to build a caching algorithm for streaming known streams. This algorithm is optimal if the bandwidth requirements are constant over t...
Traditional approaches for storage devices simulation have been based on detailed and analytic models. However, analytic models are difficult to obtain, and detailed models have a high computational cost that may not be affordable for large-scale simulations (e.g. detailed data center simulations). In current systems like large clusters, grids,...
Nowadays, high performance computing is being improved thanks to different platforms like clusters, grids, and volunteer computing environments. Volunteer computing is a type of distributed computing paradigm in which a large number of computers, volunteered by members of the general public, provide computing and storage resources for the execution...
This work presents an optimization of MPI communications, called Dynamic-CoMPI, which uses two techniques to reduce the impact of communications and non-contiguous I/O requests in parallel applications. These techniques are independent of the application and complementary to each other. The first technique is an optimization of the Two-P...
Traditional approaches for storage devices simulation have been based on detailed analytical models. However, detailed models require detailed computations, which may not be affordable for large-scale simulations. Moreover, highly detailed models cannot be easily generalized. A different approach is the black-box statistical modeling, where the stor...
Precision agriculture is a field which provides one of the most suitable scenarios for the deployment of wireless sensor networks (WSNs). The particular characteristics of agricultural environments – which may vary significantly with location – make WSNs a key technology able to provide accurate knowledge to farmers. This knowledge represents a val...
This paper presents an optimization of MPI communication, called Adaptive-CoMPI, based on runtime compression of MPI messages exchanged by applications. The technique developed can be used for any application, because its implementation is transparent for the user, and integrates different compression algorithms for both MPI collective and point-to...
In recent years, Wireless Sensor Network (WSN) technology has been increasingly employed in various application domains. The extensive use of WSNs has posed new challenges in terms of both scalability and reliability. This paper proposes Sensor Node File System (SENFIS), a novel file system for sensor nodes, which addresses both scalability and r...
Data replication is a practical and effective method to achieve efficient and fault-tolerant data access in grids. Traditionally, data replication schemes maintain an entire replica in each site where a file is replicated, providing a read-only model. These solutions require huge storage resources to store the whole set of replicas and do not allow...
This paper presents an optimization of MPI communications, called CoMPI, based on run-time compression of MPI messages exchanged by applications. A broad number of compression algorithms have been fully implemented and tested for both MPI collective and point to point primitives. In addition, this paper presents a study of several compression algor...
Parallelism in file systems is obtained by using several independent server nodes supporting one or more secondary storage devices. This approach increases the performance and scalability of the system, but a fault in one single node can stop the whole system. To avoid this problem, data must be stored using some kind of redundant technique, so any...
Data management is one of the most important problems in grid environments. One important challenge facing grid computing is the design of a grid file system. The Global Grid Forum defines a grid file system as a human-readable resource namespace for management of heterogeneous distributed data resources, that can span across multiple autonomous ad...
Parallelism in file systems is obtained by using several independent server nodes supporting one or more secondary storage devices. This approach increases the performance and scalability of the system, but a fault in one single node can make the whole system fail. In order to avoid this problem, data must be stored using some kind of redundant tec...
This paper presents Multiple-Phase Collective I/O, a novel collective I/O technique for distributed memory multiprocessors. Multiple-Phase Collective I/O is a refinement of the two-phase collective I/O technique. The communication phase is structured into several steps, which progressively increase the locality of the data to be written to a file sys...
Data management is one of the most important problems in grid environments. Most of the efforts in data management in grids have been focused on data replication. Data replication is a practical and effective method to achieve efficient data access in grids. However, existing data replication schemes fall short of providing a grid file system. One important ch...
Traditionally, distributed Web servers have used two strategies for allocating files on server nodes: full replication and full distribution. While full replication provides a highly reliable solution, it limits storage capacity to the capacity of the smallest node. On the other hand, full distribution provides higher storage capacity at the cost o...
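The capacity side of this trade-off can be made concrete with a small sketch. It only encodes what the abstract states (full replication is bounded by the smallest node) plus the assumption that full distribution can use the sum of all node capacities; the function names and the sample capacities are hypothetical.

```python
def full_replication_capacity(node_capacities):
    """Every node holds a complete copy of all files, so the usable
    capacity is bounded by the smallest node."""
    return min(node_capacities)

def full_distribution_capacity(node_capacities):
    """Each file lives on exactly one node (assumed here), so the
    usable capacity is the sum of all node capacities."""
    return sum(node_capacities)

if __name__ == "__main__":
    nodes_gb = [40, 80, 120]  # hypothetical node capacities in GB
    print("full replication :", full_replication_capacity(nodes_gb), "GB")
    print("full distribution:", full_distribution_capacity(nodes_gb), "GB")
```

For the sample nodes this gives 40 GB under full replication against 240 GB under full distribution, which is the gap that partial replication strategies try to narrow without giving up all redundancy.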
Clusters are the most common solutions for high performance computing at the present time. In this kind of system, an important challenge is the I/O subsystem design. Typically, these environments are not flexible enough and the only way to solve performance bottlenecks is to add new hardware. In this paper, we show how an I/O proxy-based architec...
Traditionally the alternatives for Web content storage have been full replication and full distribution. More recently partial replication has been proposed as a hybrid strategy. This paper gives a quantitative justification of the advantages achieved by using this approach in terms of storage capacity usage and reliability. Our analytical study prove...
Traditionally, distributed Web servers have used two strategies for allocating files on server nodes: full replication and full distribution. While full replication provides a highly reliable solution, it limits storage capacity to the capacity of the smallest node. On the other hand, full distribution provides higher storage capacity at the cost o...
Traditionally, distributed Web servers have used two strategies for allocating files on server nodes: full replication and full distribution. While full replication provides a highly reliable solution, it limits storage capacity to the capacity of the smallest node. On the other hand, full distribution provides higher storage capacity at the cost...
This paper proposes a new disk scheduling algorithm for a storage virtualization schema, decoupling virtual disks and physical disks. It allows the system to virtualize not only the storage capacity, but also the storage bandwidth, following QoS directives. That virtualization can be applied to the application bandwidth and access time requireme...
The Grid computing model has evolved in recent years to provide a high-performance computing environment over wide-area networks. However, one of the biggest problems lies in applications that make intensive and massive use of data. The solution used for the problems of these applications has been repl...
Currently there is a growing interest in using Java for high performance computing. Java has many advantages for high performance computing: it is based on a high-level and object-oriented programming model with support for multithreading and distributed computing. Furthermore, Java's virtual machine allows applications to run on multiple heteroge...
Parallelism in file systems is obtained by using several independent server nodes supporting one or more secondary storage devices. This approach increases the performance and scalability of the system, but a fault in one single node can stop the whole system. To avoid this problem, data must be stored using some kind of redundant technique, so any...
In this paper we present an implementation of a computer-based railway information system. This system requires a distributed infrastructure that should offer high availability. The proposed solution is distributed across three kinds of nodes: a node that implements the system interface for administering and monitoring using a ubiquitous Web applic...
In this paper a new multimedia caching strategy is proposed that includes several optimizations to the state of the art. This algorithm has its roots in the interval caching algorithms, but it evolves towards a more adaptive approach that could obtain better performance for variable bit-rate streams and serving media stored on multiple disk...
This paper presents the main experiences from designing and implementing a prototype for remote monitoring and management of an information system for unattended remote train stations. The prototype has several types of information sources, for example a proprietary host terminal, streaming voice, encoded files, database information, etc., in remote places with d...
Caching has been intensively used in memory and traditional file systems to improve system performance. However, the use of caching in parallel file systems and I/O libraries has been limited to I/O nodes to avoid cache coherence problems. We specify an adaptive cache coherence protocol that is very suitable for parallel file systems and parallel I...
The use of parallelism in file systems makes it possible to achieve high performance I/O in clusters and networks of workstations. Traditionally this kind of solution was only available for UNIX systems and required special servers and special APIs, which led to the modification and/or recompilation of existing applications...
The philosophy behind grid is to use idle resources to achieve a higher level of computational services (computation, storage, etc). Existing data grid solutions are based on new servers, specific APIs, and protocols; however, this approach is not a realistic solution for enterprises and universities, because it entails the deployment of new data...
One important piece of system software for clusters is the parallel file system. Two main problems can be found in current parallel file systems and parallel I/O libraries for clusters. One is that they do not use standard servers, thus it is very difficult to use these systems in heterogeneous environments. The other is that with multiple servers runn...
In 2000, the European Union funded a project named ‘RAIL: Reliability centered maintenance approach for the infrastructure and logistics of railway operation’, which aimed to study the application of Reliability centered maintenance (RCM) techniques to the railway infrastructure. In this paper, we present the results obtained in the RAIL project, inclu...
Distributed filesystems are a typical solution in networked environments such as clusters and grids. Parallel filesystems are a typical solution for achieving high performance I/O in distributed environments, but those filesystems have some limitations in heterogeneous storage systems. Usually in distributed systems, load balancing is used as a solution...
This paper describes a new parallel file system, called Expand (Expandable Parallel File System), that is based on NFS servers. Expand allows the transparent use of multiple NFS servers as a single file system. The different NFS servers are combined to create a distributed partition where files are declustered. Expand requires no changes to the NF...
Every storage platform intended to fit the requirements of the multimedia systems must incorporate a disk scheduling mechanism and a cache architecture that can handle their special needs. On the other hand, the general trend is to make integrated storage platforms that meet the requirements of deterministic applications, multimedia systems, and tr...
This article describes an implementation of MPI-IO using a new parallel file system, called Expand (Expandable Parallel File System), which is based on NFS servers. Expand combines multiple NFS servers to create a distributed partition where files are striped. Expand requires no changes to the NFS server and uses RPC operations to provide parallel...
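To make the idea of striping a file across several servers concrete, here is a minimal sketch of round-robin striping arithmetic. Round-robin placement, the stripe size, and the function name `locate` are assumptions chosen for illustration; the abstract does not say how Expand actually lays out stripes on its NFS servers.

```python
def locate(offset, stripe_size, n_servers):
    """Map a byte offset in the logical file to (server index,
    byte offset inside that server's local subfile), assuming
    fixed-size stripes dealt out to servers in round-robin order."""
    stripe_index = offset // stripe_size      # which stripe holds the byte
    server = stripe_index % n_servers         # round-robin server choice
    # Stripes already stored on that server, plus position in this stripe.
    local_offset = (stripe_index // n_servers) * stripe_size + offset % stripe_size
    return server, local_offset

if __name__ == "__main__":
    # Hypothetical layout: 4-byte stripes over 3 servers.
    for off in (0, 5, 13):
        print(off, "->", locate(off, stripe_size=4, n_servers=3))
```

With this layout consecutive stripes land on different servers, so a large sequential read can be served by several servers in parallel.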
One important piece of system software for clusters is the parallel file system. None of the current parallel file systems and parallel I/O libraries for clusters use standard servers, thus it is very difficult to use these systems in heterogeneous environments. However, why use proprietary or special-purpose servers on the server end of a parallel f...
During the last years, Internet video streaming has experienced phenomenal growth. This is happening despite the notorious difficulties of transmitting data packets with a deadline over the Internet, due to variability in throughput, delays and losses. These problems arise significantly when using wireless networks where the available bandwidth i...
Nowadays, multimedia systems are evolving towards integrated storage platforms that meet the requirements of deterministic applications, multimedia systems, and traditional best-effort applications all together. These systems must incorporate a disk scheduling mechanism and a cache architecture that can handle the requirements of each kind of request...
This paper describes an implementation of MPI-IO using a new parallel file system, called Expand (Expandable Parallel File System), that is based on NFS servers. Expand combines multiple NFS servers to create a distributed partition where files are declustered. Expand requires no changes to the NFS server and uses RPC operations to provide paralle...
The paper describes new techniques to increase the performance of collective communication operations in clusters. These techniques are based on multithreading operations and on-line data compression. The techniques proposed have been implemented in MiMPI, a thread-safe implementation of MPI. We have evaluated, and compared, the performance of MiMP...