Boon Thau Loo's research while affiliated with University of Pennsylvania and other places

Publications (226)

Preprint
Full-text available
This paper presents DeCon, a declarative programming language for implementing smart contracts and specifying contract-level properties. Driven by the observation that smart contract operations and contract-level properties can be naturally expressed as relational constraints, DeCon models each smart contract as a set of relational tables that stor...
Preprint
Full-text available
Cloud data centers are rapidly evolving. At the same time, large-scale data analytics applications require non-trivial performance tuning that is often specific to the applications, workloads, and data center infrastructure. We propose TeShu, which makes network shuffling an extensible unified service layer common to all data analytics. Since an op...
Preprint
Full-text available
Byzantine fault-tolerant protocols cover a broad spectrum of design dimensions from environmental setting on communication topology, to more technical features such as commitment strategy and even fundamental social choice related properties like order fairness. Designing and building BFT protocols remains a laborious task despite years of inten...
Article
Utilizing big-data analytics for crowdfunding platforms (e.g., AngelList and Crunchbase) and social media sites (e.g., Facebook and Twitter), this study investigates the impact of social media marketing on the start-up fundraising success through the lens of social capital theory. The results show that cognitive, structural, and relational dimensio...
Chapter
Debugging imperative network programs is a difficult task for operators as it requires understanding various network modules and complicated data structures. For this purpose, this paper presents an automated technique for repairing network programs with respect to unit tests. Given as input a faulty network program and a set of unit tests, our app...
Preprint
Full-text available
Debugging imperative network programs is a challenging task for developers because understanding various network modules and complicated data structures is typically time-consuming. To address the challenge, this paper presents an automated technique for repairing network programs from unit tests. Specifically, given as input a faulty network progr...
Preprint
Full-text available
Today's large-scale data management systems need to address distributed applications' confidentiality and scalability requirements among a set of collaborative enterprises. In this paper, we present Qanaat, a scalable multi-enterprise permissioned blockchain system that guarantees confidentiality. Qanaat consists of multiple enterprises where each...
Chapter
Full-text available
Writing classification rules to identify interesting network traffic is a time-consuming and error-prone task. Learning-based classification systems automatically extract such rules from positive and negative traffic examples. However, due to limitations in the representation of network traffic and the learning strategy, these systems lack both exp...
Preprint
Full-text available
The next frontier for the Internet, led by innovations in mobile computing, in particular 5G, together with blockchains' transparency, immutability, provenance, and authenticity, indicates the potential of running a new generation of applications on the mobile internet. A 5G-enabled blockchain system is structured as a hierarchy and needs to d...
Preprint
Writing classification rules to identify malicious network traffic is a time-consuming and error-prone task. Learning-based classification systems automatically extract such rules from positive and negative traffic examples. However, due to limitations in the representation of network traffic and the learning strategy, these systems lack both expre...
Preprint
Software Defined Networking has unfolded a new area of opportunity in distributed networking and intelligent networks. There has been great interest in performing machine learning in a distributed setting, exploiting the abstraction of SDN, which makes it easier to write complex ML queries on a standard control plane. However, most of the research has...
Article
Full-text available
Resource disaggregation is a new architecture for data centers in which resources like memory and storage are decoupled from the CPU, managed independently, and connected through a high-speed network. Recent work has shown that although disaggregated data centers (DDCs) provide operational benefits, applications running on DDCs experience degraded...
Book
This Festschrift is in honor of Prof. Andre Scedrov at the University of Pennsylvania. Scedrov has laid the foundations for a number of now well-established domains in mathematics and computer science including Proof Theory, Logic in Computer Science, Foundations in Computer Security, and Linguistics. This combination of breadth and penetrating ori...
Conference Paper
We revisit the gap between what distributed systems need from the transport layer and what protocols in wide deployment provide. Such a gap complicates the implementation of distributed systems and impacts their performance. We introduce Tunable Multicast Communication (TMC), an abstraction that allows developers to easily specialize communication...
Conference Paper
This paper presents GraphRex, an efficient, robust, scalable, and easy-to-program framework for graph processing on datacenter infrastructure. To users, GraphRex presents a declarative, Datalog-like interface that is natural and expressive. Underneath, it compiles those queries into efficient implementations. A key technical contribution of GraphRe...
Preprint
In spite of much progress and many advances, cost-effective, high-quality video delivery over the internet remains elusive. To address this ongoing challenge, we propose Sunstar, a solution that leverages simultaneous downloads from multiple servers to preserve video quality. The novelty in Sunstar is not so much in its use of multiple servers but...
Conference Paper
Full-text available
"Breaking up" software into a dataflow network of tasks can improve availability and performance by exploiting the flexibility of the resulting graph, more granular resource use, hardware concurrency and modern interconnects. Decomposing legacy systems in this manner is difficult and ad hoc however, raising such challenges as weaker consistency and...
Article
Recent emergence of software-defined networks offers an opportunity to design domain-specific programming abstractions aimed at network operators. In this paper, we propose scenario-based programming, a framework that allows network operators to program network policies by describing example behaviors in representative scenarios. Given these scenar...
Conference Paper
Microbursts can degrade application performance in datacenters by causing increased latency, jitter and packet loss. The detection of microbursts and identification of the contributing flows is the first step towards mitigating this problem. Unfortunately, microbursts are unpredictable and typically last for tens or hundreds of μs, and the high line ra...
Conference Paper
This paper presents a novel integrated platform for the automatic detection and mitigation of denial-of-service (DoS) attacks in networked systems. Recently, these attacks have evolved from simple flooding at the network layer to targeted, application-specific asymmetric attacks. Because of this trend, existing techniques, which rely primarily on...
Conference Paper
Full-text available
Failing network links are usually disabled, and packets are routed around them until the links are repaired. While it is often possible to utilize some of a failing link's capacity, losing what remains of a link's capacity is typically deemed preferable to the erratic effect that unreliable links can have on application-level behavior. We describe...
Conference Paper
In recent years, there has been a proliferation of network domain-specific languages (DSLs). These languages enable us to exploit the programmability of these networks, while still providing correctness guarantees through verification and analysis of DSLs. However, none of these DSLs have received widespread adoption. First, these new languages requi...
Article
Full-text available
Network failures continue to plague datacenter operators as their symptoms may not have direct correlation with where or why they occur. We introduce 007, a lightweight, always-on diagnosis application that can find problematic links and also pinpoint problems for each TCP connection. 007 is completely contained within the end host. During its two...
Conference Paper
A key ingredient to a startup's success is its ability to raise funding at an early stage. Crowdfunding has emerged as an exciting new mechanism for connecting startups with potentially thousands of investors. Nonetheless, little is known about its effectiveness or about the strategies that entrepreneurs should adopt in order to maximize their rate of...
Conference Paper
Graph analytics systems have gained significant popularity due to the prevalence of graph data. Many of these systems are designed to run in a shared-nothing architecture whereby a cluster of machines can process a large graph in parallel. In more recent proposals, others have argued that a single-machine system can achieve better performance and/o...
Conference Paper
We propose a demonstration of DeDoS, a platform for mitigating asymmetric DDoS attacks. These attacks are particularly challenging since attackers using limited resources can exhaust the resources of even well-provisioned servers. DeDoS resolves this by splitting monolithic software stacks into separable components called minimum splittable units (...
Conference Paper
In network management today, dynamic updates are required for traffic engineering and for timely response to security threats. Decisions for such updates are based on monitoring network traffic to compute numerical quantities based on a variety of network and application-level performance metrics. Today's state-of-the-art tools lack programming abs...
Article
Full-text available
When debugging an SDN application, diagnosing the problem is merely the first step: the operator must still find a fix that solves the problem, without causing new problems elsewhere. However, most existing debuggers focus exclusively on diagnosis and offer the network operator little or no help with finding an effective fix. Finding a suitable fix...
Conference Paper
Network provenance, which records the execution history of network events as meta-data, is becoming increasingly important for network accountability and failure diagnosis. For example, network provenance may be used to trace the path that a message traversed in a network, or to reveal how a particular routing entry was derived and the parties invo...
Conference Paper
Today, network operators are increasingly playing the role of part-time detectives: they must routinely diagnose intricate problems and malfunctions, e.g., routing or performance issues, and they must often perform forensic investigations of past misbehavior, e.g., intrusions or cybercrimes. However, the current Internet architecture offers little...
Conference Paper
This paper presents SplitStack, an architecture targeted at mitigating asymmetric DDoS attacks. These attacks are particularly challenging, since attackers can use a limited amount of resources to trigger exhaustion of a particular type of system resource on the server side. SplitStack resolves this by splitting the monolithic stack into many separ...
Conference Paper
Today, root cause analysis of failures in data centers is mostly done through manual inspection. More often than not, customers blame the network as the culprit. However, other components of the system might have caused these failures. To troubleshoot, huge volumes of data are collected over the entire data center. Correlating such large volumes...
Conference Paper
In this paper, we propose a new approach to diagnosing problems in complex distributed systems. Our approach is based on the insight that many of the trickiest problems are anomalies. For instance, in a network, problems often affect only a small fraction of the traffic (e.g., perhaps a certain subnet), or they only manifest infrequently. Thus, it...
Conference Paper
As declarative query processing techniques expand to the Web, data streams, network routers, and cloud platforms, there is an increasing need to re-plan execution in the presence of unanticipated performance changes. New runtime information may affect which query plan we prefer to run. Adaptive techniques require innovation both in terms of the alg...
Conference Paper
Crowdfunding is a recent financing phenomenon that is gaining wide popularity as a means for startups to raise seed funding for their companies. This paper presents our initial results in understanding this phenomenon using an exploratory data-driven approach. We have developed a big data platform for collecting and managing data from multiple sour...
Conference Paper
Recent emergence of software-defined networks offers an opportunity to design domain-specific programming abstractions aimed at network operators. In this paper, we propose scenario-based programming, a framework that allows network operators to program network policies by describing representative example behaviors. Given these scenarios, our synt...
Article
The performance of networks that use the Internet Protocol is sensitive to precise configuration of many low-level parameters on each network device. These settings govern the action of dynamic routing protocols, which direct the flow of traffic; in order to ensure that these dynamic protocols all converge to produce some 'optimal' flow, each param...
Conference Paper
Full-text available
In this paper, we propose a new approach to diagnosing problems in complex networks. Our approach is based on the insight that many of the trickiest problems are anomalies -- they affect only a small fraction of the traffic (e.g., perhaps a certain subnet), or they only manifest infrequently. Thus, it is quite common for the network operator to hav...
Conference Paper
Full-text available
When debugging an SDN application, diagnosing the problem is merely the first step -- the operator must still implement a solution that works, and that does not cause new problems elsewhere. However, most existing SDN debuggers focus exclusively on identifying the problem and offer the network operator little or no help with finding an effective fi...
Article
The Internet, as it stands today, is highly vulnerable to attacks. However, little has been done to understand and verify the formal security guarantees of proposed secure inter-domain routing protocols, such as Secure BGP (S-BGP). In this paper, we develop a sound program logic for SANDLog, a declarative specification language for secure routing pr...
Article
Full-text available
This paper presents MTor, a low-latency anonymous group communication system. We construct MTor as an extension to Tor, allowing the construction of multi-source multicast trees on top of the existing Tor infrastructure. MTor does not depend on an external service to broker the group communication, and avoids central points of failure and trust. MT...
Conference Paper
Networks are complex systems that unfortunately are ridden with errors. Such errors can lead to disruption of services, which may have grave consequences. Verification of networks is key to eliminating errors and building robust networks. In this paper, we propose an approach to verify networks using declarative networking, where networks are speci...
Conference Paper
Cloud today is evolving towards multi-datacenter deployment, with each datacenter serving customers in different geographical areas. The independence between datacenters, however, prohibits effective inter-datacenter resource sharing and flexible management of the infrastructure. In this paper, we propose WL2, a Software-Defined Networking (SDN) so...
Article
Full-text available
Cloud computing offers a new, attractive option to customers for quickly provisioning any size Hadoop cluster, consuming resources as a service, executing their MapReduce workload, and then paying for the time these resources were used. One of the open questions in such environments is the right choice of resources (and their amount) a user should...
Article
The paper seeks to broaden our understanding of MPTCP and focuses on the impact that initial sub-path selection can have on performance. Using empirical data, it demonstrates that which sub-path is chosen to start an MPTCP connection can have unintuitive consequences. Using numerical analysis and a model-driven investigation, the paper elucidates a...
Article
This paper presents Scalanytics, a declarative platform that supports high-performance application layer analysis of network traffic. Scalanytics uses (1) stateful network packet processing techniques for extracting application layer data from network packets, (2) a declarative rule-based language called Analog for compactly specifying analysis pip...
Article
The emergence of programmable interfaces to network controllers offers network operators the flexibility to implement a variety of policies. We propose NetEgg, a programming framework that allows a network operator to specify the desired functionality using example behaviors. Our synthesis algorithm automatically infers the state that needs to be m...
Article
As declarative query processing techniques expand in scope (to the Web, data streams, network routers, and cloud platforms), there is an increasing need for adaptive query processing techniques that can re-plan in the presence of failures or unanticipated performance changes. A status update on the data distributions or the compute nodes may h...
Article
Full-text available
In MapReduce environments, many applications have to achieve different performance goals for producing time-relevant results. One typical user question is how to estimate the completion time of a MapReduce program as a function of varying input dataset sizes and given cluster resources. In this work, we offer a novel performance evaluation fram...
Article
When debugging a distributed system, it is sometimes necessary to explain the absence of an event - for instance, why a certain route is not available, or why a certain packet did not arrive. Existing debuggers offer some support for explaining the presence of events, usually by providing the equivalent of a backtrace in conventional debuggers, but...
Article
Full-text available
NEBULA is a proposal for a Future Internet Architecture. It is based on the assumptions that (1) cloud computing will comprise an increasing fraction of the application workload offered to an Internet, and (2) access to cloud computing resources will demand new architectural features from a network. Features that we have identified include de...
Conference Paper
With increasing deployment of Multipath TCP (MPTCP) in multihoming and data center scenarios, there is a need to understand how its performance is affected in practice, both by traditional factors such as RTT measurements and by new multipath-specific considerations such as subflow selection. We carried out an initial but comprehensive study using...
Conference Paper
Full-text available
Cloud computing offers a new, attractive option to customers for provisioning a suitable size Hadoop cluster, consuming resources as a service, executing the MapReduce workload, and paying for the time these resources were used. One of the open questions in such environments is the choice and the amount of resources that a user should lease from th...
Conference Paper
Full-text available
Cloud computing enables a user to quickly provision any size Hadoop cluster, execute a given MapReduce workload, and then pay for the time the resources were used. Typically, there is a choice of different types of VM instances in the Cloud (e.g., small, medium, or large EC2 instances). The capacity differences of the offered VMs are reflected in V...
Conference Paper
The Border Gateway Protocol (BGP) is the single inter-domain routing protocol that enables network operators within each autonomous system (AS) to influence routing decisions by independently setting local policies on route filtering and selection. This independence leads to fragile networking and makes analysis of policy configurations very comple...
Article
This paper presents the design and implementation of Application-Aware Anonymity (A3), an extensible platform for rapidly prototyping and evaluating anonymity protocols on the Internet. A3 supports the development of highly tunable anonymous protocols that enable applications to tailor their anonymity properties and performance characteristics acco...
Conference Paper
When debugging an SDN, it is sometimes necessary to explain the absence of an event: why a certain rule was not installed, or why a certain packet did not arrive. Existing SDN debuggers offer some support for explaining the presence of events, usually by providing the equivalent of a "backtrace" in conventional debuggers, but they are not very good...
Conference Paper
With the tremendous growth of the Internet and the emerging software-defined networks, there is an increasing need for rigorous and scalable network management methods and tool support. This paper proposes a synthesis approach for managing software-defined networks. We formulate the construction of network control logic as a reactive synthesis prob...
Conference Paper
Mapping virtual networks to physical networks under bandwidth constraints is a key computational problem for the management of data centers. Recently proposed heuristic strategies for this problem work efficiently, but are not guaranteed to always find an allocation even when one exists. Given that the bandwidth allocation problem is NP-complete, a...
Article
Full-text available
Many applications associated with live business intelligence are written as complex data analysis programs defined by directed acyclic graphs of MapReduce jobs, for example, using Pig, Hive, or Scope frameworks. An increasing number of these applications have additional requirements for completion time guarantees. In this article, we consider the p...