ABSTRACT: Secure, fault-tolerant distributed systems are difficult to build, to validate, and to operate. Conservative design for such systems dictates that their security and fault tolerance depend on a very small number of assumptions taken on faith; such assumptions are typically called the "trusted computing base" (TCB) of a system. However, a rich trade-off exists between larger TCBs and more secure, more fault-tolerant, or more efficient systems. In our recent work, we have explored this trade-off by defining "small," generic trusted primitives--for example, an attested, monotonically sequenced FIFO buffer of a few hundred machine words guaranteed to hold appended words until eviction--and showing how such primitives can improve the performance, fault tolerance, and security of systems using them. In this article, we review our efforts in generating simple trusted primitives such as an attested circular buffer (called Attested Append-Only Memory), and an attested human activity detector. We describe the benefits of using these primitives to increase the fault-tolerance of replicated systems and archival storage, and to improve the security of email SPAM and click-fraud prevention systems. Finally, we share some lessons we have learned from this endeavor.
ABSTRACT: Mobile applications are becoming increasingly ubiquitous and provide ever richer functionality on mobile devices. At the same time, such devices often enjoy strong connectivity with more powerful machines ranging from laptops and desktops to commercial clouds. This paper presents the design and implementation of CloneCloud, a system that automatically transforms mobile applications to benefit from the cloud. The system is a flexible application partitioner and execution runtime that enables unmodified mobile applications running in an application-level virtual machine to seamlessly off-load part of their execution from mobile devices onto device clones operating in a computational cloud. CloneCloud uses a combination of static analysis and dynamic profiling to partition applications automatically at a fine granularity while optimizing execution time and energy use for a target computation and communication environment. At runtime, the application partitioning is effected by migrating a thread from the mobile device at a chosen point to the clone in the cloud, executing there for the remainder of the partition, and re-integrating the migrated thread back to the mobile device. Our evaluation shows that CloneCloud can adapt application partitioning to different environments, and can help some applications achieve as much as a 20x execution speed-up and a 20-fold decrease of energy spent on the mobile device.
European Conference on Computer Systems, Proceedings of the Sixth European conference on Computer systems, EuroSys 2011, Salzburg, Austria - April 10-13, 2011; 01/2011
ABSTRACT: We present Mantis, a new framework that automatically predicts program performance with high accuracy. Mantis integrates techniques from programming languages and machine learning for performance modeling, and is a radical departure from traditional approaches. Mantis extracts program features, which are information about program execution runs, through program instrumentation. It uses machine learning techniques to select features relevant to performance and creates prediction models as a function of the selected features. Through program analysis, it then generates compact code slices that compute these feature values for prediction. Our evaluation shows that Mantis can achieve more than 93% accuracy while training on less than 10% of the data set, which is a significant improvement over models that are oblivious to program features. The code slices that the system generates compute feature values cheaply.
ABSTRACT: Mobile applications are becoming increasingly ubiquitous and provide ever richer functionality on mobile devices. At the same time, such devices often enjoy strong connectivity with more powerful machines ranging from laptops and desktops to commercial clouds. This paper presents the design and implementation of CloneCloud, a system that automatically transforms mobile applications to benefit from the cloud. The system is a flexible application partitioner and execution runtime that enables unmodified mobile applications running in an application-level virtual machine to seamlessly off-load part of their execution from mobile devices onto device clones operating in a computational cloud. CloneCloud uses a combination of static analysis and dynamic profiling to optimally and automatically partition an application so that it migrates, executes in the cloud, and re-integrates computation in a fine-grained manner that makes efficient use of resources. Our evaluation shows that CloneCloud can achieve up to a 21.2x speedup on the smartphone applications we tested, and that it allows different partitionings for different inputs and networks.
ABSTRACT: This paper introduces the notion of a secure data capsule, which refers to an encapsulation of sensitive user information (such as a credit card number) along with code that implements an interface suitable for the use of such information (such as charging for purchases) by a service (such as an online merchant). In our capsule framework, users provide their data in the form of such capsules to web services rather than raw data. Capsules can be deployed in a variety of ways, either on a trusted third party or the user's own computer or at the service itself, through the use of a variety of hardware or software modules, such as a virtual machine monitor or trusted platform module: the only requirement is that the deployment mechanism must ensure that the user's data is only accessed via the interface sanctioned by the user. The framework further allows a user to specify policies regarding which services or machines may host her capsule, what parties are allowed to access the interface, and with what parameters. The combination of interface restrictions and policy control lets us bound the impact of an attacker who compromises the service to gain access to the user's capsule or a malicious insider at the service itself.
Advances in Neural Information Processing Systems 23: 24th Annual Conference on Neural Information Processing Systems 2010. Proceedings of a meeting held 6-9 December 2010, Vancouver, British Columbia, Canada.; 01/2010
ABSTRACT: Reducing energy consumption in datacenters is key to building low-cost datacenters. To address this challenge, we explore the potential of hybrid datacenter designs that mix low-power platforms with high-performance ones. We show how these designs can handle diverse workloads with different service level agreements in an energy-efficient fashion. We evaluate the feasibility of our approach through experiments and then discuss the design challenges and options of hybrid datacenters.
ABSTRACT: Mobile cloud computing applications run diverse workloads under diverse device platforms, networks, and clouds. Traditionally these applications are statically partitioned between weak devices and clouds, and thus may be significantly inefficient in heterogeneous environments and workloads. We introduce the notion of dynamic partitioning of applications between weak devices and clouds and argue that this is key to addressing heterogeneity problems. We formulate the dynamic partitioning problem and discuss major research challenges around system support for dynamic partitioning.
ABSTRACT: We revisit the problem of scaling software routers, motivated by recent advances in server technology that enable high-speed parallel processing---a feature router workloads appear ideally suited to exploit. We propose a software router architecture that parallelizes router functionality both across multiple servers and across multiple cores within a single server. By carefully exploiting parallelism at every opportunity, we demonstrate a 35 Gbps parallel router prototype; this router capacity can be linearly scaled through the use of additional servers. Our prototype router is fully programmable using the familiar Click/Linux environment and is built entirely from off-the-shelf, general-purpose server hardware.
ABSTRACT: Fault-tolerant services typically make assumptions about the type and maximum number of faults that they can tolerate while providing their correctness guarantees; when such a fault threshold is violated, correctness is lost. We revisit the notion of fault thresholds in the context of long-term archival storage. We observe that fault thresholds are inevitably violated in long-term services, making traditional fault tolerance inapplicable to the long term. In this work, we undertake a "reallocation of the fault-tolerance budget" of a long-term service. We split the service into service pieces, each of which can tolerate a different number of faults without failing (and without causing the whole service to fail): each piece can be either in a critical trusted fault tier, which must never fail, or an untrusted fault tier, which can fail massively and often, or other fault tiers in between. By carefully engineering the split of a long-term service into pieces that must obey distinct fault thresholds, we can prolong its inevitable demise. We demonstrate this approach with Bonafide, a long-term key-value store that, unlike all similar systems proposed in the literature, maintains integrity in the face of Byzantine faults without requiring self-certified data. We describe the notion of tiered fault tolerance, the design, implementation, and experimental evaluation of Bonafide, and argue that our approach is a practical yet significant improvement over the state of the art for long-term services.
7th USENIX Conference on File and Storage Technologies, February 24-27, 2009, San Francisco, CA, USA. Proceedings; 01/2009
ABSTRACT: Enterprise and data center networks consist of a large number of complex networked applications and services that depend upon each other. For this reason, they are difficult to manage and diagnose. In this paper we propose Macroscope, a new approach to extracting the dependencies of networked applications automatically by combining application process information with network-level packet traces. We evaluate Macroscope on traces collected at 52 laptops within a large enterprise and show that Macroscope is accurate in finding the dependencies of networked applications. We also show that Macroscope requires less human involvement and is significantly more accurate than state-of-the-art approaches that use only packet traces. Using our rich profiles of the application-service dependencies, we explore and uncover some interesting characteristics about this relationship. Finally, we discuss several usage scenarios that can benefit from Macroscope.
Proceedings of the 2009 ACM Conference on Emerging Networking Experiments and Technology, CoNEXT 2009, Rome, Italy, December 1-4, 2009; 01/2009
ABSTRACT: Clustered applications in storage area networks (SANs), widely adopted in enterprise datacenters, have traditionally relied on distributed locking protocols to coordinate concurrent access to shared storage devices. We examine the semantics of traditional lock services for SAN environments and ask whether they are sufficient to guarantee data safety at the application level. We argue that a traditional lock service design that enforces strict mutual exclusion via a globally-consistent view of locking state is neither sufficient nor strictly necessary to ensure application-level correctness in the presence of asynchrony and failures. We also argue that in many cases, strongly-consistent locking imposes an additional and unnecessary constraint on application availability. Armed with these observations, we develop a set of novel concurrency control and recovery protocols for clustered SAN applications that achieve safety and liveness in the face of arbitrary asynchrony, crash failures, and network partitions. Finally, we present and evaluate Minuet, a new synchronization primitive based on these protocols that can serve as a foundational building block for safe and highly-available SAN applications.
7th USENIX Conference on File and Storage Technologies, February 24-27, 2009, San Francisco, CA, USA. Proceedings; 01/2009
ABSTRACT: Smartphones enable a new, rich user experience in pervasive computing, but their hardware is still very limited in terms of computation, memory, and energy reserves, thus limiting potential applications. In this paper, we propose a novel architecture that addresses these challenges via seamlessly--but partially--off-loading execution from the smartphone to a computational infrastructure hosting a cloud of smartphone clones. We outline new augmented execution opportunities for smartphones enabled by our CloneCloud architecture.
Proceedings of HotOS'09: 12th Workshop on Hot Topics in Operating Systems, May 18-20, 2009, Monte Verità, Switzerland; 01/2009
ABSTRACT: The systems and networking community treasures "simple" system designs, but our evaluation of system simplicity often relies more on intuition and qualitative discussion than rigorous quantitative metrics. In this paper, we develop a prototype metric that seeks to quantify the notion of algorithmic complexity in networked system design. We evaluate several networked system designs through the lens of our proposed complexity metric and demonstrate that our metric quantitatively assesses solutions in a manner compatible with informally articulated design intuition and anecdotal evidence such as real-world adoption.
5th USENIX Symposium on Networked Systems Design & Implementation, NSDI 2008, April 16-18, 2008, San Francisco, CA, USA, Proceedings; 01/2008
ABSTRACT: New single-machine environments are emerging from abundant computation available through multiple cores and secure virtualization. In this paper, we describe the research challenges and opportunities around diversified replication as a method to increase the Byzantine-fault tolerance (BFT) of single-machine servers to software attacks or errors. We then discuss the design space of BFT protocols enabled by these new environments.
2008 USENIX Annual Technical Conference, Boston, MA, USA, June 22-27, 2008. Proceedings; 01/2008
ABSTRACT: Software routers can lead us from a network of special-purpose hardware routers to one of general-purpose extensible infrastructure--if, that is, they can scale to high speeds. We identify the challenges in achieving this scalability and propose a solution: a cluster-based router architecture that uses an interconnect of commodity server platforms to build software routers that are both incrementally scalable and fully programmable.
ABSTRACT: Antiquity is a wide-area distributed storage system designed to provide a simple storage service for applications like file systems and back-up. The design assumes that all servers eventually fail and attempts to maintain data despite those failures. Antiquity uses a secure log to maintain data integrity, replicates each log on multiple servers for durability, and uses dynamic Byzantine fault-tolerant quorum protocols to ensure consistency among replicas. We present Antiquity's design and an experimental evaluation with global and local testbeds. Antiquity has been running for over two months on 400+ PlanetLab servers storing nearly 20,000 logs totaling more than 84 GB of data. Despite constant server churn, all logs remain durable.
Proceedings of the 2007 EuroSys Conference, Lisbon, Portugal, March 21-23, 2007; 01/2007
ABSTRACT: Researchers have made great strides in improving the fault tolerance of both centralized and replicated systems against arbitrary (Byzantine) faults. However, there are hard limits to how much can be done with entirely untrusted components; for example, replicated state machines cannot tolerate more than a third of their replica population being Byzantine. In this paper, we investigate how minimal trusted abstractions can push through these hard limits in practical ways. We propose Attested Append-Only Memory (A2M), a trusted system facility that is small, easy to implement and easy to verify formally. A2M provides the programming abstraction of a trusted log, which leads to protocol designs immune to equivocation -- the ability of a faulty host to lie in different ways to different clients or servers -- which is a common source of Byzantine headaches. Using A2M, we improve upon the state of the art in Byzantine-fault tolerant replicated state machines, producing A2M-enabled protocols (variants of Castro and Liskov's PBFT) that remain correct (linearizable) and keep making progress (live) even when half the replicas are faulty, in contrast to the previous upper bound. We also present an A2M-enabled single-server shared storage protocol that guarantees linearizability despite server faults. We implement A2M and our protocols, evaluate them experimentally through micro- and macro-benchmarks, and argue that the improved fault tolerance is cost-effective for a broad range of uses, opening up new avenues for practical, more reliable services.
Proceedings of the 21st ACM Symposium on Operating Systems Principles 2007, SOSP 2007, Stevenson, Washington, USA, October 14-17, 2007; 01/2007
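The trusted-log idea in the A2M abstract above can be illustrated with a small sketch. This is not the paper's interface: the class name `AttestedLog`, its methods, and the use of a shared-key HMAC as the "attestation" are all assumptions made for illustration (a real trusted module would more plausibly use signatures and protected storage). It only shows why an append-only, sequence-numbered log makes equivocation detectable: an attestation is bound to one sequence number and one cumulative digest, so a host cannot present two different histories for the same position.

```python
import hashlib
import hmac


class AttestedLog:
    """Toy append-only log (hypothetical API, not the A2M interface).

    Each entry is bound to its sequence number and to a cumulative
    digest of everything appended before it.
    """

    def __init__(self, key: bytes):
        self._key = key               # assumed secret held by the trusted module
        self._entries = []            # list of (value, cumulative_digest)
        self._digest = b"\x00" * 32   # digest of the empty log

    def append(self, value: bytes) -> int:
        """Append a value and return its sequence number."""
        seq = len(self._entries)
        self._digest = hashlib.sha256(
            self._digest + seq.to_bytes(8, "big") + value
        ).digest()
        self._entries.append((value, self._digest))
        return seq

    def lookup(self, seq: int, nonce: bytes):
        """Return (value, digest, tag); the tag covers seq, digest, and nonce."""
        value, digest = self._entries[seq]
        msg = seq.to_bytes(8, "big") + digest + nonce
        tag = hmac.new(self._key, msg, hashlib.sha256).digest()
        return value, digest, tag


def verify(key: bytes, seq: int, digest: bytes, nonce: bytes, tag: bytes) -> bool:
    """Check that a tag attests to this exact (seq, digest, nonce) triple."""
    msg = seq.to_bytes(8, "big") + digest + nonce
    expected = hmac.new(key, msg, hashlib.sha256).digest()
    return hmac.compare_digest(expected, tag)
```

In this sketch the verifier supplies a fresh nonce with each lookup, so an old attestation cannot be replayed as a current one; there is deliberately no method that overwrites or removes an entry.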
ABSTRACT: The Border Gateway Protocol (BGP) allows each autonomous system (AS) to select routes to destinations based on semantically rich and locally determined policies. This autonomously exercised policy freedom can cause instability, where unresolvable policy-based disputes in the network result in interdomain route oscillations. Several recent works have established that such instabilities can only be eliminated by enforcing a globally accepted preference ordering on routes (such as shortest path). To resolve this conflict between policy autonomy and system stability, we propose a distributed mechanism that enforces a preference ordering only when disputes resulting in oscillations exist. This preserves policy freedom when possible, and imposes stability when required.
Proceedings of the ACM SIGCOMM 2007 Conference on Applications, Technologies, Architectures, and Protocols for Computer Communications, Kyoto, Japan, August 27-31, 2007; 01/2007
ABSTRACT: The Internet has evolved greatly from its original incarnation. For instance, the vast majority of current Internet usage is data retrieval and service access, whereas the architecture was designed around host-to-host applications such as telnet and ftp. Moreover, the original Internet was a purely transparent carrier of packets, but now the various network stakeholders use middleboxes to improve security and accelerate applications. To adapt to these changes, we propose the Data-Oriented Network Architecture (DONA), which involves a clean-slate redesign of Internet naming and name resolution.