Emmanuel Cecchet

University of Massachusetts Amherst · School of Computer Science

PhD

Publications: 89
Reads: 15,459
Citations: 3,541

Publications (89)
Conference Paper
Multi-Path TCP (MPTCP) is a new transport protocol that enables systems to exploit available paths through multiple network interfaces. MPTCP is particularly useful for mobile devices, which usually have multiple wireless interfaces. However, these devices have limited power capacity and thus judicious use of these interfaces is required. In this w...
Article
In this paper, we present an open, flexible and realistic benchmarking platform named Video BenchLab to measure the performance of streaming media workloads. While Video BenchLab can be used with any existing media server, we provide a set of tools for researchers to experiment with their own platform and protocols. The components include a MediaDr...
Article
In this demonstration, we present an open, flexible and realistic benchmarking platform named Video BenchLab to measure the performance of streaming media workloads. While Video BenchLab can be used with any existing media server, we provide a set of tools for researchers to experiment with their own platform and protocols. The components include a...
Conference Paper
In this paper, we present mBenchLab, a software infrastructure to measure the Quality of Experience (QoE) on tablets and smartphones accessing cloud-hosted Web services. mBenchLab does not rely on emulation but uses real phones and tablets with their original software stack and communication interfaces for performance evaluation. We have used mBench...
Conference Paper
Full-text available
Peer-to-peer networks are the most popular mechanism for the criminal acquisition and distribution of child pornography (CP). In this paper, we examine observations of peers sharing known CP on the eMule and Gnutella networks, which were collected by law enforcement using forensic tools that we developed. We characterize a year's worth of network a...
Article
We investigate the combined effect of application implementation method, container design, and efficiency of communication layers on the performance scalability of J2EE application servers by detailed measurement and profiling of an auction site server. We have implemented five versions of the auction site. The first version uses stateless session...
Article
Full-text available
The goal of the Smart* project is to optimize home energy consumption. As part of the project, we have designed and deployed a "live" system that continuously gathers a wide variety of environmental and operational data in three real homes. In contrast to prior work, our focus has been on sensing depth, i.e., collecting as much data as possible f...
Article
Full-text available
The Cloud is an increasingly popular platform for e-commerce applications that can be scaled on demand in a very cost-effective way. Dynamic provisioning is used to autonomously add capacity in multi-tier cloud-based applications that see workload increases. While many solutions exist to provision tiers with little or no state in applications, the...
Conference Paper
Full-text available
Cloud computing platforms are becoming increasingly popular for e-commerce applications that can be scaled on demand in a very cost-effective way. Dynamic provisioning is used to autonomously add capacity in multi-tier cloud-based applications that see workload increases. While many solutions exist to provision tiers with little or no state in appl...
Conference Paper
Full-text available
Web applications have evolved from serving static content to dynamically generating Web pages. Web 2.0 applications include JavaScript and AJAX technologies that manage increasingly complex interactions between the client and the Web server. Traditional benchmarks rely on browser emulators that mimic the basic network functionality of real Web brow...
Conference Paper
Full-text available
The high replication cost of Byzantine fault-tolerance (BFT) methods has been a major barrier to their widespread adoption in commercial distributed applications. We present ZZ, a new approach that reduces the replication cost of BFT services from 2f+1 to practically f+1. The key insight in ZZ is to use f+1 execution replicas in the normal case and...
Conference Paper
Full-text available
Many businesses rely on Disaster Recovery (DR) services to prevent either manmade or natural disasters from causing expensive service disruptions. Unfortunately, current DR services come either at very high cost, or with only weak guarantees about the amount of data lost or time required to restart operation after a failure. In this work, we argue...
Conference Paper
Full-text available
Today's cloud computing platforms have seen much success in running compute-bound applications with time-varying or one-time needs. In this position paper, we will argue that the cloud paradigm is also well suited for handling data-intensive applications, characterized by the processing and storage of data produced by high-bandwidth sensors or stre...
Conference Paper
Full-text available
Instead of writing SQL queries directly, programmers often prefer writing all their code in a general purpose programming language like Java and having their programs be automatically rewritten to use database queries. Traditional tools such as object-relational mapping tools are able to automatically translate simple navigational queries written i...
Conference Paper
Full-text available
Online Internet applications see dynamic workloads that fluctuate over multiple time scales. This paper argues that the non-stationarity in Internet application workloads, which causes the request mix to change over time, can have a significant impact on the overall processing demands imposed on data center servers. We propose a novel mix-aware dyn...
Article
Full-text available
Household smart meters that measure power consumption in real-time at fine granularities are the foundation of a future smart electricity grid. However, the widespread deployment of smart meters has serious privacy implications since they inadvertently leak detailed information about household activities. In this paper, we show that even without a...
Article
Full-text available
Continuous "always-on" monitoring is beneficial for a number of applications, but potentially imposes a high load in terms of communication, storage and power consumption when a large number of variables need to be monitored. We introduce two new filtering techniques, swing filters and slide filters, that represent within a prescribed precision a t...
Conference Paper
Full-text available
Many data center virtualization solutions, such as VMware ESX, employ content-based page sharing to consolidate the resources of multiple servers. Page sharing identifies virtual machine memory pages with identical content and consolidates them into a single shared page. This technique, implemented at the host level, applies only between VMs placed...
Article
Many data center virtualization solutions, such as VMware ESX, employ content-based page sharing to consolidate the resources of multiple servers. Page sharing identifies virtual machine memory pages with identical content and consolidates them into a single shared page. This technique, implemented at the host level, applies only between VMs placed...
Conference Paper
Full-text available
With the increasing scale and complexity of data centers, detecting and localizing performance faults in real-time has become both a pressing need and a challenge. While several approaches for performance debugging in data centers have been proposed, these techniques do not assume any constraints on the availability of operational data needed to de...
Article
Full-text available
Many data center virtualization solutions, such as VMware ESX, employ content-based page sharing to consolidate the resources of multiple servers. Page sharing identifies virtual machine memory pages with identical content and consolidates them into a single shared page. This technique, implemented at the host level, applies only between VMs placed...
Article
Full-text available
This paper develops analytical models to predict the throughput and the response time of a replicated database using measurements of the workload on a standalone database. These models allow workload scalability to be estimated before the replicated system is deployed, making the technique useful for capacity planning and dynamic service provisioni...
Article
Full-text available
The current design of database drivers – a necessary evil for interacting with a DBMS – imposes undue burdens on those who install, upgrade, and manage database systems and their applications. In this paper, we introduce Drivolution, a new architecture for DB drivers that reduces the cost, risk, and downtime associated with driver distribution, dep...
Article
Full-text available
Database replication is difficult but indispensable. We report on our experiences building and deploying middleware-based replication systems both as commercial products and research systems. We identify gaps that still separate academic research from industrial practice and thus thwart potential technology transfer from academia to the field. We s...
Article
Full-text available
The report summarizes the results of the Workshop on Middleware Benchmarking held during OOPSLA 2003. The goal of the workshop was to help advance the current practice of gathering performance characteristics of middleware implementations through benchmarking. The participants of the workshop have focused on identifying requirements of and obstacle...
Article
Full-text available
Continuous monitoring of distributed systems is part of the necessary system infrastructure for a number of applications, including detection of various anomalies such as failures or performance degradation. The LeWYS project aims at building a monitoring infrastructure to be used by system observers that will implement system-wide strategies to...
Conference Paper
Full-text available
We consider a cluster architecture in which dynamic content is generated by a database back-end and a collection of Web and application server front-ends. We study the effect of transparent query caching on the performance of such a cluster. Transparency requires that cached entries be invalidated as a result of writes. We start with a coarse-grain...
Conference Paper
Full-text available
Open source software has become a common way of disseminating research results. In this talk, we first introduce the motivations and implications of releasing research prototypes as open source software (OSS). ObjectWeb is an international consortium fostering the development of open source middleware. We give an overview of tools available for OSS...
Conference Paper
Full-text available
Clusters have become the de facto platform to scale J2EE application servers. Each tier of the server uses group communication to maintain consistency between replicated nodes. JGroups is the most commonly used Java middleware for group communications in J2EE open source implementations. No evaluation has been done yet to evaluate the scalability o...
Article
Clusters of workstations are becoming increasingly popular for powering data server applications such as large-scale Web sites or e-Commerce applications. Successful open-source tools exist for clustering the front tiers of such sites (web servers and application servers). No comparable success has been achieved for scaling the backend databases. An expensi...
Article
Full-text available
Large web or e-commerce sites are frequently hosted on clusters. Successful open-source tools exist for clustering the front tiers of such sites (web servers and application servers). No comparable success has been achieved for scaling the backend databases. An expensive SMP machine is required if the database tier becomes the bottleneck. The few t...
Conference Paper
Resource management in a Grid computing environment raises several technical issues. The monitoring infrastructure must be scalable, flexible, configurable and adaptable to support thousands of devices in a highly dynamic environment where operational conditions are constantly changing. We propose to address these challenges by combining asynchrono...
Conference Paper
Full-text available
Clustering has become a de facto standard to scale distributed systems and applications. However, the administration and management of such systems still use ad-hoc techniques that partially fulfill the needs. The expertise needed to configure and tune these systems goes beyond the capacity of a single system administrator or software developer. We...
Conference Paper
In this paper, we introduce the concept of Redundant Array of Inexpensive Databases (RAIDb). RAIDb is to databases what RAID is to disks. RAIDb aims at providing better performance and fault tolerance than a single database, at low cost, by combining multiple database instances into an array of databases. Like RAID, we define and compare different...
Article
Full-text available
Clusters have become the de facto platform to scale J2EE application servers. Each tier of the server uses group communication to maintain consistency between replicated nodes. JGroups is the most commonly used Java middleware for group communications in J2EE open source implementations. No evaluation has been done yet to evaluate the scalability o...
Conference Paper
Full-text available
Middleware has emerged as an important architectural component in modern distributed systems. It provides many solutions that hide the management of the distribution of services and computations from developers. However, its configuration is becoming increasingly complex, since it must fit application requirements while adapting to the under...
Conference Paper
Clusters of workstations are becoming increasingly popular for powering data server applications such as large-scale Web sites or e-Commerce applications. There has been much research on scaling the front tiers (web servers and application servers) using clusters, but databases usually remain on large dedicated SMP machines. In this paper, we focus on the d...
Conference Paper
Full-text available
On-line services are making increasing use of dynamically generated Web content. Serving dynamic content is more complex than serving static content. Besides a Web server, it typically involves a server-side application and a database to generate and store the dynamic content. A number of standard mechanisms have evolved to generate dynamic content...
Article
On-line services are making increasing use of dynamically generated Web content. Serving dynamic content is more complex than serving static content. Besides a Web server, it typically involves a server-side application and a database to generate and store the dynamic content. A number of standard mechanisms have evolved to generate dynamic content...
Conference Paper
Full-text available
The absence of benchmarks for Web sites with dynamic content has been a major impediment to research in this area. We describe three benchmarks for evaluating the performance of Web sites with dynamic content. The benchmarks model three common types of dynamic content Web sites with widely varying application characteristics: an online bookstore, a...
Article
The absence of benchmarks for Web sites with dynamic content has been a major impediment to research in this area. We describe three benchmarks for evaluating the performance of Web sites with dynamic content. The benchmarks model three common types of dynamic content Web sites with widely varying application characteristics: an online bookstore, a...
Article
Full-text available
We investigate the combined effect of application implementation method, container design, and efficiency of communication layers on the performance scalability of J2EE application servers by detailed measurement and profiling of an auction site server. We have implemented five versions of the auction site. The first version uses stateless session...
Article
In this paper, we present Whoops!, a clustered web cache prototype based on SciFS, a Distributed Shared Memory (DSM) that benefits from the high performance and the remote addressing capabilities of memory mapped networks like Scalable Coherent Interface (SCI). Whoops! uses the DSM for all web cache management and cache storage. Using a memory map...
Article
New memory mapped network interfaces offer both low latency and high bandwidth communications. This has implications for the design and implementation of distributed operating systems, especially with respect to global management of resources.
Conference Paper
Full-text available
We present Whoops!, a clustered Web cache prototype based on SciFS, a distributed shared memory (DSM) that benefits from the high performance and the remote addressing capabilities of memory mapped networks like Scalable Coherent Interface (SCI). Whoops! uses the DSM for all Web cache management and cache storage. Using a memory mapped network and...
Conference Paper
Full-text available
The performance of Distributed Shared Memory (DSM) systems has always suffered from high network latencies and software communication layers with a large overhead. Memory mapped networks such as Scalable Coherent Interface (SCI) allow reliable access to remote memory without involving the operating system. To show how DSM systems can benefit from this technolog...
Article
Full-text available
New memory mapped network interfaces offer both low latency and high bandwidth communications. This has implications for the design and implementation of distributed operating systems, especially with respect to global management of resources. This paper presents Kaffemik, a scalable distributed Java Virtual Machine, providing the programmer with a...
Article
Full-text available
In this paper we consider the use of a supercomputer with a hardware shared memory versus a cluster of workstations using a software Distributed Shared Memory (DSM). We focus on ray tracing applications to compare both architectures. We have ported Stingray, a parallel cone tracer developed on an SGI Origin 2000 supercomputer, to a cluster using a S...
Article
Full-text available
Java has rapidly gained a large user group and has become one of the most popular platforms for developing Internet applications, e.g., servlets. The Java VM is extensively used to execute servlets in web servers and Enterprise Java Beans in application servers. Due to Java's platform independence, it has also gained large acceptance in the open-s...
Article
Full-text available
The SIRAC laboratory has set up a cluster of PCs interconnected by a Scalable Coherent Interface (SCI) network. Such new high bandwidth and low latency memory-mapped networks can be used to implement efficient distributed shared memories (DSM). We have developed SciFS, a DSM tightly integrated with the operating system, that tries to benefit from t...
Conference Paper
Full-text available
The SIRAC laboratory has developed SciFS, a distributed shared memory (DSM) that tries to benefit from the high performance and the remote addressing capabilities of the scalable coherent interface (SCI) memory mapped network. We use SciFS for high performance cluster computing and we also experiment with it to build large scale clustered Web cache...
Article
Full-text available
Java is increasingly used to develop large server applications. In order to provide powerful platforms for such applications a number of projects have proposed Java Virtual Machines (JVMs) that are based on network of workstations. These JVMs employ the message-passing paradigm, i.e. all communication between the distributed instances of the virtua...