Jose Miguel-Alonso

University of the Basque Country | UPV/EHU · Computers Architecture and Technology

PhD in Computer Science

About

99
Publications
36,288
Reads
1,892
Citations
Introduction
Full professor at the Department of Computer Architecture and Technology of the UPV/EHU. Working on parallel and distributed systems, with a focus on high-performance computing and cloud computing.
Additional affiliations
October 1999 - September 2000
Purdue University West Lafayette
November 1989 - present
University of the Basque Country

Publications (99)
Article
This paper studies the influence that contiguous job placement has on the performance of schedulers for large-scale computing systems. In contrast with non-contiguous strategies, contiguous partitioning enables the exploitation of communication locality in applications, and also reduces inter-application interference. However, contiguous partitioni...
Article
The high performance computing landscape is shifting from collections of homogeneous nodes towards heterogeneous systems, in which nodes consist of a combination of traditional out-of-order execution cores and accelerator devices. Accelerators, built around GPUs, many-core chips, FPGAs or DSPs, are used to offload compute-intensive tasks. The adven...
Article
Full-text available
Cloud computing environments allow customers to dynamically scale their applications. The key problem is how to lease the right amount of resources, on a pay-as-you-go basis. Application re-dimensioning can be implemented effortlessly, adapting the resources assigned to the application to the incoming user demand. However, the identification of the...
Article
Full-text available
Many current parallel computers are built around a torus interconnection network. Machines from Cray, HP, and IBM, among others, make use of this topology. In terms of topological advantages, square (2D) or cubic (3D) tori would be the topologies of choice. However, for different practical reasons, 2D and 3D tori with different number of nodes per...
Preprint
Full-text available
Security in IoT systems is extremely important, as an intrusion into an IoT device or network can affect not only our domestic lives, but also industrial assets, with the potential to cause enormous damage. We discuss IoT security issues as defined by the OWASP Foundation, focusing on network related aspects. After a brief description of SDN in gen...
Article
As in other cybersecurity areas, machine learning (ML) techniques have emerged as a promising solution to detect Android malware. In this sense, many proposals employing a variety of algorithms and feature sets have been presented to date, often reporting impressive detection performances. However, the lack of reproducibility and the absence of a st...
Article
Full-text available
Top-of-rack switches based on photonic switching fabrics (PSF) could provide higher bandwidth and energy efficiency for datacenters (DC) and high-performance computers (HPC) than those with traditional electronic crossbars. However, because of their bufferless nature, PSF are affected by contention much more drastically than traditional packet-swit...
Article
Full-text available
Object detection is an essential capability for performing complex tasks in robotic applications. Today, deep learning (DL) approaches are the basis of state-of-the-art solutions in computer vision, where they provide very high accuracy albeit with high computational costs. Due to the physical limitations of robotic platforms, embedded devices are...
Preprint
Full-text available
As in other cybersecurity areas, machine learning (ML) techniques have emerged as a promising solution to detect Android malware. In this sense, many proposals employing a variety of algorithms and feature sets have been presented to date, often reporting impressive detection performances. However, the lack of reproducibility and the absence of a st...
Article
Full-text available
OpenFlow is a network device management and monitoring protocol that has enabled research, experimentation and implementation of software-defined networks in general, and data center networks in particular. In this review, after describing the particularities of data center networks and OpenFlow technology, a collection of recent research papers in...
Article
Full-text available
The Software Defined Networking (SDN) paradigm enables the development of systems that centrally monitor and manage network traffic, providing support for the deployment of machine learning-based systems that automatically detect and mitigate network intrusions. This paper presents an intelligent system capable of deciding which countermeasures to...
Article
The identification of network attacks which target information and communication systems has been a focus of the research community for years. Network intrusion detection is a complex problem which presents a diverse number of challenges. Many attacks currently remain undetected, while newer ones emerge due to the proliferation of connected devices...
Preprint
The identification of cyberattacks which target information and communication systems has been a focus of the research community for years. Network intrusion detection is a complex problem which presents a diverse number of challenges. Many attacks currently remain undetected, while newer ones emerge due to the proliferation of connected devices an...
Article
Full-text available
Cloud infrastructures provide computing resources to applications in the form of Virtual Machines (VMs). Many applications deployed in cloud resources have an elastic behavior, that is, they change the number of servers (VMs) dynamically, adapting the application to the workload. Scaling-out and scaling-in operations are managed by an auto-scaler m...
Conference Paper
Future exascale supercomputers will be composed of thousands of nodes. In those massive systems, the search for physically close nodes will become essential to deliver an optimal environment to execute parallel applications. Schedulers manage those resources, shared by many users and jobs, searching for partitions in which jobs will run. Significan...
Article
Full-text available
Kernel density estimation (KDE) is a popular technique used to estimate the probability density function of a random variable. KDE is considered a fundamental data smoothing algorithm, and it is a common building block in many scientific applications. In a previous work we presented S-KDE, an efficient algorithmic approach to compute KDE that outpe...
Article
Peer-to-Peer systems have been introduced as an alternative to the traditional client-server scheme. Distributed Hash Tables, a type of structured Peer-to-Peer system, have been designed for massive storage purposes. In this work we model the behavior of a DHT based system, Cassandra, with focus on its fault tolerance capabilities, and more specifi...
Article
In a High-Throughput Computing (HTC) system, system failures and churning pose an important performance limitation. The time used by tasks running in a node that suddenly fails (or abandons the system) constitutes a waste of resources. These aborted tasks are usually reinserted into the system for automatic re-execution, causing additional overhead...
Article
Kernel density estimation (KDE) is a statistical technique used to estimate the probability density function of a sample set with unknown density function. It is considered a fundamental data-smoothing problem for use with large datasets, and is widely applied in areas such as climatology and biometry. Due to the large volumes of data that these pr...
Article
We propose an extension to multiple dimensions of the univariate index of agreement between Probability Density Functions (PDFs) used in climate studies. We also provide a set of high-performance programs targeted both to single and multi-core processors. They compute multivariate PDFs by means of kernels, the optimal bandwidth using smoothed boots...
Article
In this paper, we propose and evaluate improved first fit (IFF), a fast implementation of the first fit contiguous partitioning strategy. It has been devised to accelerate the process of finding contiguous partitions in space-shared parallel computers in which the nodes are arranged forming multidimensional cubic networks. IFF uses system status in...
Article
Full-text available
Cloud infrastructures are designed to simultaneously service many, diverse applications that consist of collections of Virtual Machines (VMs). The placement policy used to map applications onto physical servers has important effects in terms of application performance and resource efficiency. We propose enhancing placement policies with network-awa...
Article
Non-contiguous partitioning strategies are often used to select and assign a set of nodes of a parallel computer to a particular job. The main advantage of these strategies, compared to contiguous ones, is the reduction of system fragmentation. However, without contiguity, locality in communications cannot be easily exploited, resulting in longer j...
Conference Paper
Full-text available
Cloud infrastructures are designed to simultaneously service many, diverse applications that consist of collections of Virtual Machines (VMs). The policy used to map applications onto physical servers (placement policy) has important effects in terms of application performance and resource efficiency. This paper proposes enhancing placement policie...
Article
Full-text available
SpiNNaker is a biologically-inspired massively-parallel computer designed to model up to a billion spiking neurons in real-time. A full-fledged implementation of a SpiNNaker system will comprise more than 10^5 integrated circuits (half of which are SDRAMs and half multi-core systems-on-chip). Given this scale, it is unavoidable that some components...
Article
In this work, we present a proposal to build a high throughput computing system totally based upon the Peer-to-Peer (P2P) paradigm. We discuss the general characteristics of P2P systems, with focus on P2P storage, and the expected characteristics of the HTC system: totally decentralized, not requiring permanent connection, and able to implement sch...
Conference Paper
Full-text available
Cloud computing environments offer the user the capability of running their applications in an elastic manner, using only the resources they need, and paying for what they use. However, to take advantage of this flexibility, it is advisable to use an auto-scaling technique that adjusts the resources to the incoming workload, both reducing the over-...
Conference Paper
Full-text available
SpiNNaker is a custom-made architecture designed to model large-scale spiking neural nets. One of the most significant characteristics of neural nets is their extreme communication needs; each neuron propagates its activation to thousands of other neurons. This paper shows analytical proof that the novel multicast router in SpiNNaker is a better so...
Article
Full-text available
Neural networks present a fundamentally different model of computation from the conventional sequential digital model, for which conventional hardware is typically poorly matched. However, a combination of model and scalability limitations has meant that neither dedicated neural chips nor FPGAs have offered an entirely satisfactory solution. SpiNN...
Article
Full-text available
The optimal mapping of tasks of a parallel program onto nodes of a parallel computing system has a remarkable impact on application performance. In this paper we propose an optimization framework to solve the mapping problem, which takes into account the communication matrix of the application and a cost matrix that depends on the topology of the p...
Article
Full-text available
Configuring a million-core parallel system at boot time is a difficult process when the system has neither specialised hardware support for the configuration process nor a preconfigured default state that puts it in operating condition. SpiNNaker is a parallel Chip Multiprocessor (CMP) system for neural network (NN) simulation. Where most large CM...
Article
Full-text available
Interconnection networks arranged as k-ary n-trees or spines are widely used to build high-performance computing clusters. Current blade-based technology allows the integration of the first level of the network together with the compute elements. The remaining network stages require dedicated rack space. In most systems one or several racks house t...
Article
Full-text available
This paper describes INSEE, a simulation framework developed at the University of the Basque Country. INSEE is designed to carry out performance-related studies of interconnection networks. It is composed of two main modules: a Functional Simulator of Interconnection Networks (FSIN) and a TRaffic GENeration module (TrGen), together with several oth...
Article
Current consumer-grade computers and game devices incorporate very powerful processors that can be used to accelerate many classes of scientific codes. In this paper we explore the ability of the Cell Broadband Engine to run two similar Estimation of Distribution Algorithms, one for the discrete domain and the other for the continuous domain. Start...
Conference Paper
Full-text available
The SpiNNaker system is a biologically-inspired massively parallel architecture of bespoke multi-core System-on-Chips. The aim of its design is to simulate up to a billion spiking neurons in (biological) real-time. Packets, in SpiNNaker, represent neural spikes and these travel through the two-dimensional triangular torus network that connects the...
Article
Full-text available
Interconnection networks based on the k-ary n-tree topology are widely used in high-performance parallel computers. However, this topology is expensive and complex to build. In this paper we evaluate an alternative tree-like topology that is cheaper in terms of cost and complexity because it uses fewer switches and links. This alternative topology...
Chapter
Estimation of Distribution Algorithms (EDAs) are a set of techniques that belong to the field of Evolutionary Computation. They are similar to Genetic Algorithms (GAs), in the sense that, given a problem, they use a population of individuals to represent solutions, and this population is made to evolve towards the most promising solutions. However,...
Conference Paper
The optimal mapping of tasks of a parallel program onto nodes of a parallel computing system has a remarkable impact on application performance. We propose a new criterion to solve the mapping problem in 2D and 3D meshes that uses the communication matrix of the application and a cost matrix that depends on the system topology. We test via simulatio...
Article
Full-text available
In this article we summarize our first contact with GPGPU computing. Part of a Fortran program has been ported to NVIDIA CUDA. To that end, the serial application was profiled to detect the parts with the highest computational cost, which are the targets for parallelization. We describe the transforma...
Article
Full-text available
In this paper we discuss environments for the full-system simulation of multicomputers. These environments are composed of a large collection of modules that simulate the compute nodes and the network, plus additional linking elements that perform communication and synchronization. We present our own environment, in which we integrate Simics with I...
Conference Paper
Full-text available
Evaluation of high performance parallel systems is a delicate issue, due to the difficulty of generating workloads that represent those that will run on actual systems. We overview the most usual workloads for performance evaluation purposes, in the scope of interconnection networks simulation. Aiming to fill the gap between purely synthetic and a...
Conference Paper
Full-text available
Current consumer-grade computers and game devices incorporate very powerful processors that can be used to accelerate many classes of scientific codes. However, programming multi-core chips, hybrid multi-processors or graphical processing units is not an easy task for those programmers that deal mainly with sequential codes. In this paper, we...
Conference Paper
Full-text available
SpiNNaker is a massively parallel architecture designed to model large-scale spiking neural networks in (biological) real-time. Its design is based around ad-hoc multi-core System-on-Chips which are interconnected using a two-dimensional toroidal triangular mesh. Neurons are modeled in software and their spikes generate packets that propagat...
Conference Paper
Full-text available
Configuring a million-core parallel system at boot time is a difficult process when the system has neither specialised hardware support for the configuration process nor a preconfigured default state that puts it in operating condition. SpiNNaker is a parallel Chip Multiprocessor (CMP) system for neural network (NN) simulation. Where most large CMP...
Conference Paper
Full-text available
This paper studies the influence that job placement may have on scheduling performance, in the context of massively parallel computing systems. A simulation-based performance study is carried out, using workloads extracted from real systems logs. The starting point is a parallel system built around a k-ary n-tree network and using well-known schedu...
Article
Full-text available
This paper addresses the utilization of traces taken from MPI applications to do simulation-based performance studies of parallel computing systems. Different mechanisms to capture traces are discussed, pointing out important limitations of some of them. One of these limitations is the invisibility of message interchanges in collective operations,...
Conference Paper
Full-text available
This paper studies the influence that task placement may have on the performance of applications, mainly due to the relationship between communication locality and overhead. This impact is studied for torus and fat-tree topologies. A simulation-based performance study is carried out, using traces of applications and application kernels, to measure...
Conference Paper
Full-text available
Evaluation of high performance parallel systems is a delicate issue, due to the difficulty of generating workloads that represent, with fidelity, those that will run on actual systems. In this paper we make an overview of the most usual methodologies used to generate workloads for performance evaluation purposes, focusing on the network: random tra...
Article
Full-text available
Interconnection networks in current parallel systems do not only increase in size; their buffer capacity and number of source ports have increased as well. All these factors result in a significant rise of network congestion compared with their predecessors. Consequently, packet injection must be restricted in order to prevent throughput degradatio...
Conference Paper
Full-text available
Any simulation-based evaluation of an interconnection network proposal requires a good characterization of the workload. Synthetic traffic patterns based on independent traffic sources are commonly used to measure performance in terms of average latency and peak throughput. As they do not capture the level of self-throttling that occurs in most par...
Article
Full-text available
The SpiNNaker massively parallel GALS system has been designed to support large-scale simulations of biologically inspired neural networks in real-time. The system is built around the chip-multiprocessor (CMP) technology using low-power ARM processors with an asynchronous network-on-chip (NoC) to support high performance parallel distributed proce...
Conference Paper
Full-text available
In this work we discuss a range of approaches to full-system simulation of distributed memory parallel computers, with emphasis on the interconnection network. We present our environment, based on Simics®, and discuss how unforeseen interactions and fine tuning of components can affect results.
Conference Paper
Full-text available
In this paper we show the difficulties encountered when performing full system simulation of a distributed memory parallel system. To illustrate the problem, we have chosen a workbench that evaluates the impact on application performance of some simple congestion-control mechanism that can be implemented in the interconnection network. Applications...
Conference Paper
Full-text available
Many parallel computers use Tori interconnection networks. Machines from Cray, HP and IBM, among others, exploit these topologies. In order to maintain full network symmetry, 2D and 3D Tori must have the same number of nodes (k) per dimension resulting in square or cubic topologies. Nevertheless, for practical reasons, computer engineers have desig...
Conference Paper
Estimation of Distribution Algorithms (EDAs) are a set of optimization techniques that have been successfully applied to different kinds of problems. In this paper, we deal with the creation of multivariate calibration models in quantitative chemistry. For this purpose, we use parallel implementations of two EDAs (EBNA_BIC and UMDA), using different...
Article
Full-text available
This paper describes the application of a collection of data mining methods to solve a calibration problem in a quantitative chemistry environment. Experimental data obtained from reactions which involve known concentrations of two or more components are used to calibrate a model that, later, will be used to predict the (unknown) concentrations of...
Conference Paper
Full-text available
Recent parallel systems use multiple injection ports and various injection policies, but little is known about their impact on network performance. This paper evaluates the influence that these injection interfaces have on maximum sustained throughput in adaptive cut-through torus networks by modeling the number of injection queues (1 or 4), and th...
Article
Full-text available
This paper presents, discusses and evaluates parallel implementations of a set of algorithms designed for optimization tasks: Estimation of Bayesian Network Algorithms (EBNAs). These algorithms belong to the family of Evolutionary Computation. Two different APIs have been combined: message passing and threads, with the aim of obtaining good perform...
Article
Full-text available
The simulation of interconnection networks for parallel and distributed systems requires mechanisms for traffic generation. This traffic can be synthetic, extracted from traces, or obtained from running parallel applications. In this work we describe how, in the context of TrGen (the traffic generation module of INSEE...
Conference Paper
Full-text available
Many simulation-based performance studies of interconnection networks are carried out using synthetic workloads under the assumption of independent traffic sources. We show that this assumption, although useful for some traffic patterns, can lead to deceptive performance results for loads beyond saturation. Network throughput varies so much a...
Article
This paper proposes new parallel versions of some estimation of distribution algorithms (EDAs). Focus is on maintenance of the behavior of sequential EDAs that use probabilistic graphical models (Bayesian networks and Gaussian networks), implementing a master–slave workload distribution for the most computationally intensive phases: learning the pr...
Conference Paper
Full-text available
In this paper we introduce INSEE, an environment to help in the design of interconnection networks for parallel computing systems. It contains two main modules: a system to generate traffic (TrGen) and a lightweight functional simulator (FSIN). Additionally, external tools can be integrated into the environment. Examples are SICOSYS (a sophisticate...
Conference Paper
Full-text available
In this paper we introduce TrGen, a traffic generation environment specifically designed to interact with simulators of interconnection networks for parallel and distributed systems. This environment is able to generate synthetic traffic, and actual traffic taken from traces (of previous program runs). It can also cooperate with complete-system sim...
Article
Full-text available
The Java language specification states that every access to an array needs to be within the bounds of that array; i.e. between 0 and length - 1. Different techniques for different programming languages have been proposed to eliminate explicit bounds checks. Some of these techniques are implemented in off-the-shelf Java Virtual Machines (JVMs). The u...
Conference Paper
Full-text available
In this paper we introduce INSEE, an environment to help in the design of interconnection networks for parallel computing systems. It contains two main modules: a system to generate traffic (TrGen) and a lightweight functional simulator (FSIN). Additionally, external tools can be integrated into the environment. Examples are SICOSYS (a sophisticate...
Conference Paper
Full-text available
This paper describes the application of several data mining approaches to solve a calibration problem in a quantitative chemistry environment. Experimental data obtained from reactions which involve known concentrations of two or more components are used to calibrate a model that, later, will be used to predict the (unknown) concentrations of those...
Conference Paper
Full-text available
This paper studies the effect that HOL (Head-of-Line) blocking in the packet injection queue has on the performance of bidirectional k-ary n-cubes, for values of k over a certain threshold (around 20). The HOL blocking causes an unbalanced use of the channels corresponding to the two directions of bidirectional links, which is responsible for a...
Conference Paper
Full-text available
The dynamic nature of large-size Network Computing Systems (NCSs) and the varying monitoring demands from the end-users pose serious challenges for monitoring systems (MSs). A statically configured MS initially adjusted to perform optimally may end performing poorly. A reconfiguration mechanism for a distributed MS is proposed. It enables the MS to...
Conference Paper
Full-text available
The class of dense circulant graphs of degree four with optimal distance-related properties is analyzed in this paper. An algebraic study of this class is carried out. Two geometric characterizations are given, one in the plane and the other in space. Both characterizations facilitate the analysis of their topological properties and corroborate their suit...
Article
Full-text available
Many simulation-based performance studies of interconnection networks are carried out using synthetic workloads under the assumption of independent traffic sources. Although this assumption may be useful for some traffic patterns, it may lead to erroneous conclusions about the usefulness of some design proposals for heavy loads. In this work we sho...