
Pedro García López
- PhD
- Professor (Assistant) at Universitat Rovira i Virgili
About
- 178 Publications
- 22,375 Reads
- 2,851 Citations
Publications (178)
Function-as-a-Service (FaaS) struggles with burst-parallel jobs because starting a job requires multiple independent invocations. The lack of a group invocation primitive complicates application development and overlooks crucial aspects like locality and worker communication. We introduce a new serverless solution designed specifically for burst-para...
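The missing primitive can be pictured with a small sketch. Everything below is illustrative: the invoke_group name, the rank/size fields, and the thread pool standing in for real FaaS invocations are assumptions, not the paper's API. It contrasts one logical group invocation, where every worker learns its rank and the group size, with today's N independent calls.

    # Hypothetical sketch of a "group invoke" primitive; a thread pool
    # stands in for real FaaS invocations.
    from concurrent.futures import ThreadPoolExecutor

    def invoke_one(fn, payload):
        # Stand-in for a single FaaS invocation (e.g., an HTTP call).
        return fn(payload)

    def invoke_group(fn, n, shared_payload):
        # Start n workers as one logical job: every worker learns its
        # rank and the group size, which is what would let a real
        # runtime exploit locality and worker-to-worker communication.
        with ThreadPoolExecutor(max_workers=n) as pool:
            futures = [pool.submit(invoke_one, fn,
                                   {**shared_payload, "rank": i, "size": n})
                       for i in range(n)]
            return [f.result() for f in futures]

    def worker(ctx):
        return "worker %d/%d processed %s" % (ctx["rank"], ctx["size"], ctx["job"])

    print(invoke_group(worker, 4, {"job": "burst-parallel-demo"}))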
Access transparency means that both local and remote resources are accessed using identical operations. With transparency, unmodified single-machine applications could run over disaggregated compute, storage, and memory resources. Hiding the complexity of distributed systems through transparency would have great benefits, like scaling-out local-par...
Serverless computing greatly simplifies the use of cloud resources. In particular, Function-as-a-Service (FaaS) platforms enable programmers to develop applications as individual functions that can run and scale independently. Unfortunately, applications that require fine-grained support for mutable state and synchronization, such as machine learni...
Serverless computing, in particular the Function-as-a-Service (FaaS) execution model, has recently been shown to be effective for running large-scale computations. However, little attention has been paid to highly-parallel applications with unbalanced and irregular workloads. Typically, these workloads have been kept out of the cloud due to the impossib...
Serverless functions provide high levels of parallelism, short startup times, and "pay-as-you-go" billing. These attributes make them a natural substrate for data analytics workflows. However, the impossibility of direct communication between functions makes the execution of workflows challenging. The current practice to share intermediate data amo...
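The staging pattern criticized above can be made concrete: since functions cannot talk to each other directly, a producer writes its intermediate result to an object store and a downstream consumer reads it back. A minimal sketch using boto3; the bucket name and key layout are placeholders, and credential/region setup is omitted.

    # Producer/consumer functions exchanging intermediate data through
    # an object store. Bucket and key names are assumptions.
    import json
    import boto3

    s3 = boto3.client("s3")
    BUCKET = "workflow-intermediate-data"  # placeholder bucket

    def producer(task_id, result):
        # Stage the partial result instead of sending it directly,
        # since serverless functions cannot connect to each other.
        s3.put_object(Bucket=BUCKET,
                      Key="stage1/%s.json" % task_id,
                      Body=json.dumps(result))

    def consumer(task_id):
        obj = s3.get_object(Bucket=BUCKET, Key="stage1/%s.json" % task_id)
        return json.loads(obj["Body"].read())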
Serverless computing has become very popular today since it largely simplifies cloud programming. Developers no longer need to worry about provisioning or operating servers, and they pay only for the compute resources used when their code runs. This new cloud paradigm suits many applications well, and researchers have already begun...
Unexpectedly, the rise of serverless computing has also collaterally started the "democratization" of massive-scale data parallelism. This new trend, heralded by PyWren, seeks to enable untrained users to execute single-machine code in the cloud at massive scale through platforms like AWS Lambda. Driven by this vision, this article presents Litho...
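For context, the programming model pioneered by PyWren boils down to a map-style call over serverless functions. A minimal sketch following the open-source Lithops API; exact parameters and backends vary across versions.

    # Run ordinary single-machine Python over many serverless functions.
    import lithops

    def square(x):
        return x * x

    fexec = lithops.FunctionExecutor()   # uses the configured FaaS backend
    fexec.map(square, range(100))        # one invocation per chunk of inputs
    print(sum(fexec.get_result()))       # gathers results: 328350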
As more applications are being moved to the Cloud thanks to serverless computing, it is increasingly necessary to support the native life cycle execution of those applications in the data center. But existing cloud orchestration systems either focus on short-running workflows (like IBM Composer or Amazon Step Functions Express Workflows) or impose...
Serverless computing has seen a myriad of work exploring its potential. Some systems exploit Function-as-a-Service (FaaS) properties of automatic elasticity and scale to run highly-parallel computing jobs. However, they focus on specific platforms and convey that their ideas can be extrapolated to any FaaS runtime. An important question arises: do a...
Within the next 10 years, advances on resource disaggregation will enable full transparency for most Cloud applications: to run unmodified single-machine applications over effectively unlimited remote computing resources. In this article, we present five serverless predictions for the next decade that will realize this vision of transparency -- equ...
While serverless computing is changing the way applications run in the cloud, vendor lock-in limits the adoption of multiple serverless computing platforms. Furthermore, the complexity of different Cloud APIs precludes massive adoption of Cloud platforms. In this paper we advocate for access transparency: enabling local and remote resources to be ac...
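A toy illustration of what access transparency means in code: the same client operations work whether the resource is local or reached through a proxy. The RemoteProxy class is purely illustrative; a real system would marshal the calls over the network instead of delegating in-process.

    class LocalCounter:
        def __init__(self):
            self.value = 0
        def increment(self):
            self.value += 1
            return self.value

    class RemoteProxy:
        # Forwards attribute accesses to a (conceptually remote) object.
        def __init__(self, target):
            self._target = target
        def __getattr__(self, name):
            return getattr(self._target, name)

    def run(counter):
        # Identical operations, regardless of where the counter lives.
        counter.increment()
        return counter.increment()

    print(run(LocalCounter()))               # local resource
    print(run(RemoteProxy(LocalCounter())))  # "remote" resource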
As more applications are being moved to the Cloud thanks to serverless computing, it is increasingly necessary to support native life cycle execution of those applications in the data center. But existing systems either focus on short-running workflows (like IBM Composer or Amazon Express Workflows) or impose considerable overheads for synchronizin...
For many years, the distributed systems community has struggled to smooth the transition from local to remote computing. Transparency means concealing the complexities of distributed programming like remote locations, failures or scaling. For us, full transparency implies that we can compile, debug and run unmodified single-machine code over effect...
Function as a Service (FaaS) is based on a reactive programming model where functions are activated by triggers in response to cloud events (e.g., objects added to an object store). The inherent elasticity and the pay-per-use model of serverless functions make them very appropriate for embarrassingly parallel tasks like data preprocessing, or even...
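The reactive model described above can be sketched as a handler activated by an object-store notification, in the style of an AWS Lambda function triggered by S3. The event layout follows S3's notification format; the preprocessing step is a placeholder.

    # A function activated by "object created" events in an object store.
    def handler(event, context):
        for record in event.get("Records", []):
            bucket = record["s3"]["bucket"]["name"]
            key = record["s3"]["object"]["key"]
            # Embarrassingly parallel preprocessing: each new object is
            # handled by an independent, elastically scaled invocation.
            print("preprocessing s3://%s/%s" % (bucket, key))
        return {"status": "done"}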
Serverless computing is an emerging paradigm that greatly simplifies the usage of cloud resources and suits well to many tasks. Most notably, Function-as-a-Service (FaaS) enables programmers to develop cloud applications as individual functions that can run and scale independently. Yet, due to the disaggregation of storage and compute resources in...
Serverless computing has become very popular today since it largely simplifies cloud programming. Developers no longer need to worry about provisioning or operating servers, and they pay only for the compute resources used when their code runs. This new cloud paradigm suits many applications well, and researchers have already begun invest...
The old mantra of decentralizing the Internet is back with fanfare, this time riding the blockchain technology hype. We have already seen one technology that was supposed to change the nature of the Internet: peer-to-peer. The reality is that peer-to-peer naming systems failed, peer-to-peer social networks failed, and yes, peer-to-peer storage failed...
Object stores are becoming pervasive due to their scalability and simplicity to manage data growth. Their rampant adoption, however, contrasts with their scant flexibility to support multi-tenancy. Very often, this results in a deficient adaptation of the system to the heterogeneous tenants’ demands and the multiple applications sharing the same obj...
Unexpectedly, the rise of serverless computing has also collaterally started the "democratization" of massive-scale data parallelism. This new trend, heralded by PyWren, seeks to enable untrained users to execute single-machine code in the cloud at massive scale through platforms like AWS Lambda. Inspired by this vision, this industry paper present...
Since the appearance of Amazon Lambda in 2014, all major cloud providers have embraced the Function as a Service (FaaS) model, because of its enormous potential for a wide variety of applications. As expected (and also desired), the competition is fierce in the serverless world, and includes aspects such as the run-time support for the orchestratio...
In many online storage services, end-users mainly interact with the system via "fat" storage clients that integrate complex functionality. This means that to obtain a complete performance evaluation of one of such systems we may need to generate workloads on the client side that reproduce the behavior of real users. Unfortunately, this remains as a...
Personal Cloud (PC) storage services such as Dropbox or Google Drive have become increasingly popular in the last few years. Unfortunately, these services are still not secure. Even assuming “perfect” data confidentiality, securely sharing a folder in these services is still an issue, not to mention that the existing sharing m...
Cloud storage services like Dropbox, Google Drive, and OneDrive, to cite a few, are becoming an increasingly “vital” tool in our everyday life. Unfortunately, these services can incur large network overhead in different usage scenarios. To reduce it, these systems utilize several techniques like source-based deduplication, chunking, delta compression,...
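As a rough illustration of the first two techniques, the sketch below splits data into fixed-size chunks and uploads only chunks whose hash the server has not seen. Real clients use content-defined chunking and delta compression; this is a deliberately simplified stand-in with an arbitrary chunk size.

    import hashlib

    CHUNK_SIZE = 4 * 1024 * 1024  # 4 MiB, an arbitrary choice

    def chunks(data):
        for i in range(0, len(data), CHUNK_SIZE):
            yield data[i:i + CHUNK_SIZE]

    def sync(data, known_hashes):
        # Return only the chunks that actually need to be transferred.
        to_upload = []
        for chunk in chunks(data):
            digest = hashlib.sha256(chunk).hexdigest()
            if digest not in known_hashes:
                known_hashes.add(digest)
                to_upload.append((digest, chunk))
        return to_upload

    server_index = set()
    first = sync(b"A" * CHUNK_SIZE + b"B" * CHUNK_SIZE, server_index)
    second = sync(b"A" * CHUNK_SIZE + b"C" * CHUNK_SIZE, server_index)
    print(len(first), len(second))  # 2 chunks uploaded, then only 1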
In this paper we claim that your private opinions cannot be controlled by a single centralized entity. Examples of this are user participation in open polls or rating services (stars, like/dislike) within a community. To this aim, we present TallyNetworks, an edge-centric distributed overlay that aims to provide end-to-end verifiabilit...
Extracting value from data stored in object stores, such as OpenStack Swift and Amazon S3, can be problematic in common scenarios where analytics frameworks and object stores run in physically disaggregated clusters. One of the main problems is that analytics frameworks must ingest large amounts of data from the object store prior to the actual com...
The abundance of computing technologies and devices implies that we will live in a data-driven society in the coming years. But this data-driven society requires radically new technologies in the data center to deal with data manipulation, transformation, access control, sharing and placement, among others.
Personal cloud storage systems are revolutionizing the way people think about and access their files. As the prevailing model, these systems use unicast to push file changes to each of the "unsynced" devices. As a result, they transmit the same information multiple times. This puts an unnecessary strain on outgoing bandwidth at the datacenters. One...
Data sharing in Personal Clouds blurs the lines between on-line storage and content distribution with a strong social component. Such social information may be exploited by researchers to devise optimized data management techniques for Personal Clouds. Unfortunately, due to their proprietary nature, data sharing is one of the least studied facets of t...
Software-defined storage (SDS) aims to minimize the complexity of data management in the Cloud. SDS decouples the control plane from the data plane and simplifies the management of the storage system via automated storage policy enforcement. In this paper, we propose a novel SDS framework for Object Storage that allows decentralizing policy enforcement...
As the complexity and scale of cloud storage systems grow, software-defined storage (SDS) has become a prime candidate to simplify cloud storage management. Here, the authors present IOStack, the first SDS architecture for object stores (such as OpenStack Swift). At the control plane, the provisioning of SDS services to tenants is made according to...
Users are unceasingly relying on personal clouds (like Dropbox, Box, etc) to store, edit and retrieve their files stored in remote servers. These systems generally follow a client-server model to distribute the files to end-users. This means that they require a huge amount of bandwidth to meet the requirements of their clients. Personal clouds with...
Personal Cloud services, such as Dropbox or Box, have been widely adopted by users. Unfortunately, very little is known about the internal operation and general characteristics of Personal Clouds since they are proprietary services. In this paper, we focus on understanding the nature of Personal Clouds by presenting the internal structure and a me...
In many aspects of human activity, there has been a continuous struggle between the forces of centralization and decentralization. Computing exhibits the same phenomenon; we have gone from mainframes to PCs and local networks in the past, and over the last decade we have seen a centralization and consolidation of services and applications in data c...
Designing and validating large-scale distributed systems is still a complex issue. The asynchronous event-based nature of distributed communications makes these systems complex to implement, debug and test. In this article, we introduce the continuation complexity problem, which arises when synchronous invocations must be converted to asynchronous e...
The design of elastic file synchronization services like Dropbox is an open and complex issue that the major commercial providers have not yet unveiled, as it includes challenges like fine-grained programmable elasticity and efficient change notification to millions of devices. In this paper, we propose a novel architecture for file synchronization which...
Lately, Personal Cloud storage services like Dropbox have emerged as user-centric solutions that provide easy management of the users' data. To meet the requirements of their clients, such services require a huge amount of storage and bandwidth. In an attempt to reduce these costs, we focus on maximizing the benefit that can be derived from the in...
The integration of business processes into existing applications involves considerable development efforts and costs for IT departments. This precludes the pervasive implementation of BPM in organizations where important applications remain isolated from the existing workflows. In this paper, we introduce a novel concept, Workflow Weaving, based on...
Cloud content providers must deliver vast amounts of data to an ever-growing number of users while maintaining responsive performance, thus increasing bandwidth-provisioning expenditures. To mitigate this problem, the authors transparently integrate BitTorrent into the cloud provider infrastructure and leverage users' upstream capacity to reduce ba...
In-line deduplication clusters provide high throughput and scalable storage/archival services to enterprises and organizations. Unfortunately, high throughput comes at the cost of activating several storage nodes on each request, due to the parallel nature of superchunk routing. This may prevent storage nodes from exploiting disk standby times to p...
In the last few years, we have seen a rapid expansion of social networking. Digital relationships between individuals are becoming capital for turning to one another for communication and collaboration. These online relationships are creating new opportunities to define socially oriented computing models. In this paper, we propose to leverage these...
In classic storage services, the transfer protocol used is usually HTTP. This means that all download requests are handled by a central server which sends the requested files in a single stream. But such a transfer is limited by the narrowest network condition along the way, or by the server being overloaded by requests from many clients. In this co...
The Personal Cloud model is a mainstream service that meets the growing demand of millions of users for reliable off-site storage. However, despite their broad adoption, very little is known about the quality of service (QoS) of Personal Clouds. In this paper, we present a measurement study of three major Personal Clouds: DropBox, Box and SugarSyn...
Personal Clouds, such as DropBox and Box, provide open REST APIs for developers to create clever applications that make their service even more attractive. These APIs are a powerful abstraction that makes it possible for applications to transparently manage data from user accounts, blurring the lines between a Personal Cloud service and storage Iaa...
Over the last years we have seen the proliferation of many new popular web applications, which are commonly used on a daily basis by most of us. The challenges that have to be overcome by web application designers include how to make these applications support as many concurrent users as possible, without degrading the application’s performance, and wi...
Peer-to-peer (P2P) storage systems aggregate spare storage resources from end users to build a large collaborative online storage solution. In these systems, however, the high levels of user churn—peers failing or leaving temporarily or permanently—affect the quality of the storage service and might put data reliability at risk. Indeed, one of the...
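A back-of-the-envelope model often used when provisioning redundancy under churn (the parameters below are illustrative, not taken from the paper): with n fragments of which any k suffice to rebuild an object, the object is available whenever at least k fragment holders are online.

    from math import comb

    def availability(n, k, p):
        # P(at least k of n fragments online), each holder online w.p. p.
        return sum(comb(n, i) * p**i * (1 - p)**(n - i)
                   for i in range(k, n + 1))

    # 3-way replication (k=1) vs. a (16, 10) erasure code,
    # with peers online 50% of the time.
    print(availability(3, 1, 0.5))    # ~0.875
    print(availability(16, 10, 0.5))  # ~0.227 -> needs more fragments or repair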
Personal storage is a mainstream service used by millions of users. Among the existing alternatives, Friend-to-Friend (F2F) systems are aimed to leverage a secure and private off-site storage service. However, the specific characteristics of these systems (reduced node degree, correlated availabilities) represent a hard obstacle to their performanc...
Nowadays, the growing necessity for secure and private off-site storage motivates the appearance of novel storage infrastructures. In this sense, it is increasingly common to find storage systems where users interact just with a set of trustworthy participants, such as in Friend-to-Friend (F2F) networks. In general, these systems have been treated...
Personal storage is a mainstream service used by millions of users. Among the existing alternatives, Friend-to-Friend (F2F) systems are nowadays an interesting research topic aimed to leverage a secure and private off-site storage service. However, the specific characteristics of F2F storage systems (reduced node degree, correlated availabilities) r...
The increasing popularity of Cloud storage services is leading end-users to store their digital lives (including photos, videos, work documents, etc.) in the Cloud. However, many users are still reluctant to move their data to the Cloud due to the amount of control ceded to Cloud vendors. To let users retain the control over their data, Friend-to-F...
The conversion of legacy single-user applications into collaborative multi-user tools is a recurrent topic in groupware scenarios. Many recent literature works have tried to achieve transparent collaboration, which consists of enabling collaborative features without modifying the original application source code. In this paper, we define the availab...
In this paper we discuss how we improved the MChannel group communication middleware for Mobile Ad-hoc Networks (MANETs) in order to let it become both delay- and energy-aware. MChannel makes use of the Optimized Link State Routing (OLSR) protocol, which is natively based on a simple hop-count metric for the route selection process. Based on such m...
The development of distributed applications in large-scale environments has always been a complex task. In order to guarantee non-functional properties like scalability or availability, developers are usually faced with the same problems over and over again. These problems can be separated into distributed concerns, such as distribution, load...
Neuropsychological Rehabilitation is a complex clinic process which tries to restore or compensate cognitive and behavioral disorders in people suffering from a central nervous system injury. Information and Communication Technologies (ICTs) in Biomedical Engineering play an essential role in this field, allowing improvement and expansion of presen...
In P2P storage systems peers need to contribute some local storage resources in order to obtain a certain online and reliable storage capacity. To guarantee that the storage service works, P2P storage systems have to meet two main requirements. First, the storage system needs to maintain fairness among peers by ensuring that peers consuming more on...
In this paper we present the concept of community downloads as a mechanism to improve the overall performance of BitTorrent clients. A community is a group of nodes interested in the same content working cooperatively inside a swarm. To reinforce this cooperation among community nodes, we designed two new algorithms: Group Rarest-First (piece selec...
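For reference, the classic rarest-first heuristic that Group Rarest-First builds on fits in a few lines; a group-aware variant would additionally favor pieces missing from the community. The sketch below is a simplification of what real BitTorrent clients do.

    from collections import Counter

    def rarest_first(my_pieces, peers_pieces):
        # Pick the piece we lack that is rarest among connected peers.
        counts = Counter()
        for pieces in peers_pieces:
            counts.update(pieces)
        candidates = [p for p in counts if p not in my_pieces]
        return min(candidates, key=lambda p: counts[p]) if candidates else None

    peers = [{0, 1, 2}, {1, 2}, {2, 3}]
    print(rarest_first({2}, peers))  # pieces 0 and 3 are rarest; returns 0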
Acquired Brain Injury (ABI), whether of vascular or traumatic origin, is one of the most important causes of neurological disabilities. People who suffer ABI see their quality of life decrease due to the impairment of one or more cognitive functions (memory, attention, language or executive functions). The traditional cognitive r...
Churn is an inherent property of peer-to-peer (P2P) networks which is difficult to characterize due to the fact that few systems are actually in use, and the space of possible applications is still under scrutiny. Despite its relevance, and even though there is a plethora of representative models, it is an open question how faithfully those models...
Distributed Hash Tables (DHTs) have been used as a common building block in many distributed applications, including Cloud and Grid. However, there are still important security vulnerabilities that hinder their adoption in today's large-scale computing platforms. For instance, routing vulnerabilities have been a subject of intensive research but ex...
Nowadays, data storage requirements from end-users are growing, demanding more capacity, more reliability and the capability to access information from anywhere. Cloud storage services meet this demand by providing transparent and reliable storage solutions. Most of these solutions are built on distributed infrastructures that rely on data redundan...
In this paper we present UDON, a novel Utility Driven Overlay Network framework for routing service requests in highly dynamic large scale shared infrastructures. UDON combines an application-provided utility function to express the service’s QoS in a compact way, with an epidemic protocol to disseminate this information in a scalable and robust w...
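The epidemic dissemination UDON relies on can be sketched as push gossip: in each round every node forwards its freshest utility values to a few random peers. The fanout, round count, and merge rule below are illustrative choices, not the paper's protocol.

    import random

    def gossip_round(views, fanout=2):
        # views: node_id -> {node_id: utility}. One push round.
        updates = []
        for node, view in views.items():
            others = [n for n in views if n != node]
            for peer in random.sample(others, k=min(fanout, len(others))):
                updates.append((peer, dict(view)))
        for peer, view in updates:
            views[peer].update(view)  # merge everything seen so far

    random.seed(1)
    views = {i: {i: random.random()} for i in range(8)}  # own utility only
    for _ in range(4):
        gossip_round(views)
    print(min(len(v) for v in views.values()))  # coverage grows per round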
The conversion of legacy single-user applications to collaborative multi-user tools is a recurrent topic in groupware settings. Many works tried to achieve collaboration transparency: to enable collaborative features without modifying the source code of the single-user application. In this paper, we present a novel blackbox solution that achieves...
A common factor among all the existing distributed, peer-to-peer systems is their lack of genericity. Typically, information-centric services (such as range queries) are deployed ad-hoc onto a specific peer-to-peer overlay. Such solutions may be efficient, but they are not portable to other peer-to-peer infrastructures, and so the servi...
Mobile Ad Hoc Network (MANET) middleware must be aware of the underlying multi-hop topology to self-adapt and to improve its communication efficiency. For this reason, many approaches rely on specific cross-layer communications to interact with the network protocols in the kernel space. But these solutions break the strict layering of the network s...
Nowadays, we are witnessing an increasing growth of Web 2.0 content such as micronews, blogs and RSS feeds. This trend, exemplified by applications like Twitter and LiveJournal, is starting to be held back not only by the limitations of existing services – proprietary and centralized – but also by the cumbersome process of discovering and tracking intere...
Peer-to-peer (P2P) storage systems are strongly affected by churn - temporary and permanent peer failures. Because of this churn, the main requirement of such systems is to guarantee that stored objects can always be retrieved. This requirement is especially needed in two main situations: when users want to access the stored objects or when data main...
Churn is an inherent property of peer-to-peer (P2P) networks. Despite its relevance, there is not yet a universal tool that gives researchers the opportunity to compare their contributions under the same general conditions. To fill this gap, we present the first open-source, simulator-independent tool for churn modeling.
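To make the modeling task concrete: a churn trace generator typically alternates online sessions and downtimes drawn from fitted distributions. The sketch below uses a Weibull distribution with illustrative parameters; real measurement studies fit different distributions per system.

    import random

    def churn_trace(horizon, shape=0.6, scale=120.0):
        # Return (join_time, leave_time) sessions for one peer.
        t, sessions = 0.0, []
        while t < horizon:
            online = random.weibullvariate(scale, shape)   # session length
            sessions.append((t, min(t + online, horizon)))
            offline = random.weibullvariate(scale, shape)  # downtime
            t += online + offline
        return sessions

    random.seed(42)
    for join, leave in churn_trace(600)[:3]:
        print("online from %.1f to %.1f" % (join, leave))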
We present here a brief summary of the works presented this year at the COPS 2010 workshop. The COPS workshop is in its sixth edition and attracts top researchers from the peer-to-peer field. This year the workshop had an acceptance rate of 40%, with 6 full papers accepted from a total of 15 submissions.