
Fernando Guirado
University of Lleida (UdL) · Department of Computer and Industrial Engineering
Ph.D. in Computer Science
About
52 Publications · 6,050 Reads
484 Citations
Publications (52)
In Cloud Computing, the virtual machine scheduling in datacenters becomes challenging when trying to satisfy user-service requirements and, at the same time, manage resources efficiently. Poor load management results in host overloads that trigger a continuous flow of virtual machine (VM) migrations to correct this situation, thus negatively im...
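The overload-correction cycle this abstract describes can be illustrated with a minimal sketch. All thresholds, host and VM names below are hypothetical, not taken from the paper, and the selection heuristics (smallest VM, least-loaded target) stand in for whatever policy the work actually proposes:

```python
# Minimal sketch of overload-driven VM migration: when a host's CPU
# utilisation exceeds a threshold, move its smallest VMs to the
# least-loaded other host until the overload is resolved.
OVERLOAD_THRESHOLD = 0.9  # hypothetical utilisation limit

def host_load(host):
    return sum(vm["cpu"] for vm in host["vms"]) / host["capacity"]

def rebalance(hosts):
    migrations = []
    for host in hosts:
        while host_load(host) > OVERLOAD_THRESHOLD and host["vms"]:
            vm = min(host["vms"], key=lambda v: v["cpu"])   # cheapest to move
            target = min((h for h in hosts if h is not host), key=host_load)
            host["vms"].remove(vm)
            target["vms"].append(vm)
            migrations.append((vm["name"], target["name"]))
    return migrations

hosts = [
    {"name": "h1", "capacity": 10.0,
     "vms": [{"name": "vm1", "cpu": 6.0}, {"name": "vm2", "cpu": 4.0}]},
    {"name": "h2", "capacity": 10.0, "vms": []},
]
moves = rebalance(hosts)
print(moves)  # [('vm2', 'h2')]
```

The point of the sketch is the feedback loop itself: a naive trigger like this can keep generating migrations under fluctuating load, which is exactly the "continuous flow of VM migrations" the abstract identifies as the problem.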
Background and objective:
The emergence of Next-Generation sequencing has created a push for faster and more accurate multiple sequence alignment tools. The growing number of sequences and their increasing lengths, which demand more system resources and degrade result accuracy, pose a serious challenge for these applications. Consis...
The rise of high-resolution and high-throughput sequencing technologies has driven the emergence of such new fields of application as precision medicine. However, this has also led to an increase in the storage and processing requirements for the bioinformatics tools, which can only be provided by high-performance and massive data processing infras...
Using virtualization, cloud environments dynamically satisfy the user's computational resource needs. This dynamic use of resources determines the demand for working hosts. Through virtual machine (VM) migrations, datacenters perform load balancing to optimize resource usage and resolve saturation. In this work, a policy, named WPSP...
Using virtualization, cloud environments dynamically satisfy the user's computational resource needs. This dynamic use of resources determines the demand for working hosts. Through virtual machine (VM) migrations, datacenters perform load balancing to optimise resource usage and resolve saturation. In this work, a policy is implement...
With the advent of new high-throughput next-generation sequencing technologies, the volume of genetic data processed has increased significantly. It is becoming essential for these applications to achieve large-scale alignments with thousands of sequences or even whole genomes. However, all current MSA tools have exhibited scalability issues when t...
Nowadays, cloud computing is a growing scenario applied to many scientific and manufacturing areas due to its flexibility for adapting to highly demanding computing requirements. The advantages of the pay-as-you-go model, elasticity, and the flexibility and customization offered by virtualization make cloud computing an attractive option for meeting th...
Next-generation sequencing, also known as high-throughput sequencing, has increased the volume of genetic data processed by sequencers. In the bioinformatics scientific area, highly rated multiple sequence alignment tools, such as MAFFT, ProbCons, and T-Coffee (TC), use probabilistic consistency as a step prior to the progressive alignment stage...
Large-scale data processing techniques, currently known as Big-Data, are used to manage the huge amount of data that are generated by sequencers. Although these techniques have significant advantages, few biological applications have adopted them. In the Bioinformatic scientific area, Multiple Sequence Alignment (MSA) tools are widely applied for e...
Reducing energy consumption in large-scale computing facilities has become a major concern in recent years. Most of the techniques have focused on determining the computing requirements based on load predictions and thus turning unnecessary nodes on and off. Nevertheless, once the available resources have been configured, new opportunities arise fo...
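The on/off provisioning idea mentioned in this abstract (sizing the set of powered nodes from a load prediction) can be sketched in a few lines; the capacities and forecast values here are purely illustrative, not the paper's:

```python
import math

# Sketch of load-prediction-driven node provisioning: keep only as many
# nodes powered on as the predicted load requires; the rest are switched
# off to save energy. Loads and capacity are made-up numbers.
def nodes_needed(predicted_load, node_capacity):
    return max(1, math.ceil(predicted_load / node_capacity))

forecast = [120.0, 40.0, 300.0]   # hypothetical predicted load per interval
capacity = 100.0                  # work units one node handles per interval
plan = [nodes_needed(load, capacity) for load in forecast]
print(plan)  # [2, 1, 3]
```

The abstract's point is that further savings are possible *after* this sizing step, once the powered-on resources are fixed.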
Scheduling and resource allocation to optimize performance criteria in multi-cluster heterogeneous environments is known as an NP-hard problem, not only for the resource heterogeneity, but also for the possibility of applying co-allocation to take advantage of idle resources across clusters. A common practice is to use basic heuristics to attempt t...
Large-scale federated environments have emerged to meet the requirements of increasingly demanding scientific applications. However, the seemingly unlimited availability of computing resources, together with their heterogeneity, turns scheduling into an NP-hard problem. Unlike exhaustive algorithms and deterministic heuristics, evolutionary algorithms have been...
Multiple Sequence Alignment (MSA) is essential for a wide range of applications in Bioinformatics. Traditionally, the alignment accuracy was the main metric used to evaluate the goodness of MSA tools. However, with the growth of sequencing data, other features, such as performance and the capacity to align larger datasets, are gaining strength. To...
This short paper presents a prototype based on the PSysCal tool [1] to help ecologists in their daily tasks. One of these tasks consists of modeling the interaction between different species in complex ecosystems. Nowadays, the current tools that allow this task to be performed, such as PSysCal, require huge amounts of computational resources, and also a...
Multiple sequence alignment (MSA) is crucial for high-throughput next generation sequencing applications. Large-scale alignments with thousands of sequences are necessary for these applications. However, the quality of the alignment of current MSA tools decreases sharply when the number of sequences grows to several thousand. This accuracy degradat...
Multiple sequence alignment (MSA) is one of the most useful tools in bioinformatics. However, the growth of sequencing data imposes further difficulties for aligning it with traditional tools. For large-scale alignments with thousands of sequences it will be necessary to use and take advantage of high-performance computing (HPC). This paper, focus...
Accuracy on multiple sequence alignments (MSA) is of great significance for such important biological applications as evolution and phylogenetic analysis, homology and domain structure prediction. In such analyses, alignment accuracy is crucial. In this paper, we investigate a combined scoring function capable of obtaining a good approximation to t...
The exploitation of throughput in a parallel application that processes an input data stream is a difficult challenge. For typical coarse-grain applications, where the computation time of tasks is greater than their communication time, the maximum achievable throughput is determined by the maximum task computation time. Thus, the improvement in thr...
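The throughput bound stated in this abstract can be made concrete: in a coarse-grain pipeline the steady-state output interval equals the longest task computation time, and replicating that task lets several data items be processed concurrently. The stage times and replica counts below are illustrative only:

```python
# Pipeline throughput is bounded by the slowest stage: the steady-state
# output interval is max(stage_time). Replicating a stage r times lets r
# items be processed concurrently, so its effective time is stage_time/r.
def throughput(stage_times, replicas=None):
    replicas = replicas or [1] * len(stage_times)
    bottleneck = max(t / r for t, r in zip(stage_times, replicas))
    return 1.0 / bottleneck  # items per time unit

stages = [2.0, 8.0, 3.0]                 # hypothetical computation times
base = throughput(stages)                # bounded by the 8.0 stage
boosted = throughput(stages, [1, 4, 1])  # replicate the bottleneck x4
print(base, boosted)  # 0.125 then ~0.333 (new bottleneck is the 3.0 stage)
```

Note that once the original bottleneck is replicated enough, another stage becomes the bound, which is why replication decisions have to consider the whole pipeline.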
Multi-cluster environments are composed of multiple clusters of computers that act collaboratively, thus allowing computational problems that require more resources than those available in a single cluster to be treated. However, the degree of complexity of the scheduling process is greatly increased by the heterogeneity of resources and the co-...
Multi-cluster environments are composed of multiple clusters that act collaboratively, thus allowing computational problems that require more resources than those available in a single cluster to be treated. However, the degree of complexity of the scheduling process is greatly increased by the resource heterogeneity and the co-allocation process,...
Multi-cluster environments are composed of multiple clusters of computers that act collaboratively, thus allowing computational problems that require more resources than those available in a single cluster to be treated. However, the degree of complexity of the scheduling process is greatly increased by the heterogeneity of resources and the co-al...
CodiP2P is a distributed platform for computation based on the peer-to-peer paradigm. This article presents a novel distributed authentication method that suits the platform and adapts to its characteristics. The developed method is based on the Web of Trust paradigm, i.e., not depending on a traditional PKI infrastructure, and focuses on efficienc...
Accurate and fast construction of multiple sequence alignments (MSA) is of great significance for such important biological applications as evolution and phylogenetic analysis, homology and domain structure prediction. In such analyses, alignment accuracy is crucial. In this paper we investigate a combined scoring function capable of obtaining a goo...
Multiple Sequence Alignment (MSA) is an extremely powerful tool for important biological applications, such as phylogenetic analysis, identification of conserved motifs and domains and structure prediction. In this paper we propose a new approach to reduce the computational requirements of TCoffee, a memory demanding MSA tool that uses a consistenc...
In the biotechnology field, the Multiple Sequence Alignment (MSA) problem, a process with high demands on high-performance computing, is one of the new challenges to address on the new parallel systems. The aim of this problem is to find similar regions in biological sequences. Furthermore, the goal of MSA applications is to align a...
Multiple Sequence Alignment (MSA) constitutes an extremely powerful tool for important biological applications such as phylogenetic analysis, identification of conserved motifs and domains and structure prediction. In spite of the improvement in speed and accuracy introduced by MSA programs, the computational requirements for large-scale alignments...
Multi-cluster environments are composed of multiple clusters of computers that act collaboratively, thus allowing computational problems that require more resources than those available in a single cluster to be treated. However, the degree of complexity of the scheduling process is greatly increased by the heterogeneity of resources and co-all...
We present the first parallel implementation of the T-Coffee consistency-based multiple aligner. We benchmark it on the Amazon Elastic Compute Cloud (EC2) and show that the parallelization procedure is reasonably effective. We also conclude that for a web server with moderate usage (10K hits/month) the cloud provides a cost-effective alternative to in-hous...
Peer-to-Peer (P2P) computing, the harnessing of idle compute cycles through the Internet, offers new research challenges in the domain of distributed computing. This paper presents CoDiP2P, a Computing Distributed architecture using the P2P paradigm. CoDiP2P allows computing resources from ordinary users to be shared in open access by means of c...
The exploitation of parallelism in a message passing platform implies a previous modeling phase of the parallel application as a task graph, which properly reflects its temporal behavior. In this paper, we analyze the classical task graph models of the literature and their drawbacks when modeling message passing programs with an arbitrary task stru...
There is a large range of image processing applications that act on an input sequence of image frames that are continuously received. Throughput is a key performance measure to be optimized when executing them. In this paper we propose a new task replication methodology for optimizing throughput for an image processing application in the field of m...
Parallelism in applications that act on a stream of input data can be exploited with two different approaches, spatial and temporal. In this paper we propose a new task mapping algorithm, called EXPERT, to exploit temporal parallelism efficiently when the streaming application is running in a pipeline fashion. We compare the performance of spatial...
Pipeline applications simultaneously execute different instances of an input data stream. The iterative behavior of these applications makes the mapping process more difficult than that applied to classical task-parallel or data-parallel applications. In this paper we propose a new mapping algorithm, called ROUTE (resource optimization under throug...
Pipeline applications simultaneously execute different instances from an input data set. Performance parameters for such applications are latency (the time taken to process an individual data set) and throughput (the aggregate rate at which data sets are processed). In this paper, we propose a mapping algorithm that improves activity periods for pr...
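The two performance parameters defined in this abstract can be computed directly for a simple synchronous pipeline. This is a sketch with made-up stage times, not the paper's mapping algorithm:

```python
# For a synchronous pipeline, latency is the time one data set spends
# traversing all stages; throughput is the steady-state processing rate,
# limited by the slowest stage.
def latency(stage_times):
    return sum(stage_times)

def throughput(stage_times):
    return 1.0 / max(stage_times)

stages = [1.0, 4.0, 2.0]  # hypothetical per-stage processing times
print(latency(stages), throughput(stages))  # 7.0 and 0.25
```

The two metrics pull in different directions: merging stages can shorten latency but lengthen the slowest stage, hurting throughput, which is why mapping algorithms must trade them off.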
Simulation frameworks are widely used to carry out performance predictions of parallel programs. In general, these environments do not support the use of automatic mapping mechanisms for assigning tasks to processors. We present a tool called pMAP (predicting the best mapping of parallel applications) that performs the mapping process of message-pa...
The detection and exploitation of different kinds of parallelism (task parallelism and data parallelism) often leads to efficient parallel programs. This paper presents a simulation environment to predict the best mapping for the execution of message-passing applications on distributed systems. Using this environment, we evaluate the performance of...
The mapping of parallel applications constitutes a difficult problem for which very few practical tools are available. AMEEDA has been developed in order to overcome the lack of a general-purpose mapping tool. The automatic services provided in AMEEDA include instrumentation facilities, parameter extraction modules and mapping strategies. With all...
The efficient mapping of parallel tasks is essential in order to exploit the gain from parallelisation. In this work, we focus on modelling and mapping message-passing applications that are defined by the programmer with an arbitrary interaction pattern among tasks. A new model is proposed, known as TTIG (Temporal Task Interaction Graph), which cap...
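A TTIG-style representation can be sketched as a graph whose nodes carry task computation times and whose edges record, for each pair of interacting tasks, the communication volume and how much of their work can overlap. This is my reading of a temporal task-interaction model, not the paper's exact definition; all names and numbers are hypothetical:

```python
# Sketch of a task-interaction graph in the spirit of TTIG: nodes hold
# computation times; edges hold communication volume and a concurrency
# ratio in [0, 1] saying how much two adjacent tasks can run in parallel
# (0 = strict precedence, 1 = fully concurrent).
class TaskGraph:
    def __init__(self):
        self.comp = {}    # task name -> computation time
        self.edges = {}   # (task_a, task_b) -> (comm_volume, concurrency)

    def add_task(self, name, time):
        self.comp[name] = time

    def add_interaction(self, a, b, comm, concurrency):
        self.edges[(a, b)] = (comm, concurrency)

    def pair_time(self, a, b):
        """Estimated joint execution time of two interacting tasks:
        fully concurrent pairs overlap completely, strict-precedence
        pairs serialise; intermediate ratios interpolate."""
        comm, c = self.edges[(a, b)]
        overlapped = max(self.comp[a], self.comp[b])
        serialised = self.comp[a] + self.comp[b]
        return c * overlapped + (1 - c) * serialised + comm

g = TaskGraph()
g.add_task("T1", 5.0)
g.add_task("T2", 3.0)
g.add_interaction("T1", "T2", comm=1.0, concurrency=0.5)
print(g.pair_time("T1", "T2"))  # 0.5*5 + 0.5*8 + 1 = 7.5
```

Capturing this degree of potential overlap is what lets a mapper decide whether two interacting tasks should share a processor or be placed apart.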
A fundamental issue affecting the performance of parallel applications running on distributed systems is the assignment of tasks to processors. This paper shows the effectiveness of scheduling strategies derived from the use of the temporal behaviour of tasks included in the new TTIG (Temporal Task Interaction Graph) model. Experimentation was perf...
In the distributed processing area, mapping and scheduling are very important issues in order to exploit the gain from parallelization. The generation of efficient static mapping techniques implies a previous modelling phase of the parallel application as a task graph, which properly reflects its temporal behaviour. In this paper we use a new model...
An efficient mapping of a parallel program onto the processors is vital for achieving high performance on a parallel computer. When the structure of the parallel program, in terms of its task execution times, task dependencies, and amount of communication data, is known a priori, mapping can be accomplished statically at compile time. Mapping algorithm...