Boon Thau Loo's research while affiliated with University of Pennsylvania and other places
What is this page?
This page lists the scientific contributions of an author, who either does not have a ResearchGate profile, or has not yet added these contributions to their profile.
It was automatically created by ResearchGate to create a record of this author's body of work. We create such pages to advance our goal of creating and maintaining the most comprehensive scientific repository possible. In doing so, we process publicly available (personal) data relating to the author as a member of the scientific community.
If you're a ResearchGate member, you can follow this page to keep up with this author's work.
If you are this author, and you don't want us to display this page anymore, please let us know.
Publications (226)
This paper presents DeCon, a declarative programming language for implementing smart contracts and specifying contract-level properties. Driven by the observation that smart contract operations and contract-level properties can be naturally expressed as relational constraints, DeCon models each smart contract as a set of relational tables that stor...
Cloud data centers are rapidly evolving. At the same time, large-scale data analytics applications require non-trivial performance tuning that is often specific to the applications, workloads, and data center infrastructure. We propose TeShu, which makes network shuffling an extensible unified service layer common to all data analytics. Since an op...
Byzantine fault-tolerant protocols cover a broad spectrum of design dimensions from environmental setting on communication topology, to more technical features such as commitment strategy and even fundamental social-choice-related properties like order fairness. Designing and building BFT protocols remains a laborious task despite years of inten...
Utilizing big-data analytics for crowdfunding platforms (e.g., AngelList and Crunchbase) and social media sites (e.g., Facebook and Twitter), this study investigates the impact of social media marketing on start-up fundraising success through the lens of social capital theory. The results show that cognitive, structural, and relational dimensio...
Debugging imperative network programs is a difficult task for operators as it requires understanding various network modules and complicated data structures. For this purpose, this paper presents an automated technique for repairing network programs with respect to unit tests. Given as input a faulty network program and a set of unit tests, our app...
Debugging imperative network programs is a challenging task for developers because understanding various network modules and complicated data structures is typically time-consuming. To address the challenge, this paper presents an automated technique for repairing network programs from unit tests. Specifically, given as input a faulty network progr...
Today's large-scale data management systems need to address distributed applications' confidentiality and scalability requirements among a set of collaborative enterprises. In this paper, we present Qanaat, a scalable multi-enterprise permissioned blockchain system that guarantees confidentiality. Qanaat consists of multiple enterprises where each...
Writing classification rules to identify interesting network traffic is a time-consuming and error-prone task. Learning-based classification systems automatically extract such rules from positive and negative traffic examples. However, due to limitations in the representation of network traffic and the learning strategy, these systems lack both exp...
The next frontier for the Internet, led by innovations in mobile computing, in particular 5G, together with blockchains' transparency, immutability, provenance, and authenticity, indicates the potential of running a new generation of applications on the mobile internet. A 5G-enabled blockchain system is structured as a hierarchy and needs to d...
Writing classification rules to identify malicious network traffic is a time-consuming and error-prone task. Learning-based classification systems automatically extract such rules from positive and negative traffic examples. However, due to limitations in the representation of network traffic and the learning strategy, these systems lack both expre...
Software Defined Networking has unfolded a new area of opportunity in distributed networking and intelligent networks. There has been great interest in performing machine learning in a distributed setting, exploiting the abstraction of SDN, which makes it easier to write complex ML queries on a standard control plane. However, most of the research has...
Resource disaggregation is a new architecture for data centers in which resources like memory and storage are decoupled from the CPU, managed independently, and connected through a high-speed network. Recent work has shown that although disaggregated data centers (DDCs) provide operational benefits, applications running on DDCs experience degraded...
This Festschrift is in honor of Prof. Andre Scedrov at the University of Pennsylvania. Scedrov has laid the foundations for a number of now well-established domains in mathematics and computer science including Proof Theory, Logic in Computer Science, Foundations in Computer Security, and Linguistics.
This combination of breadth and penetrating ori...
We revisit the gap between what distributed systems need from the transport layer and what protocols in wide deployment provide. Such a gap complicates the implementation of distributed systems and impacts their performance. We introduce Tunable Multicast Communication (TMC), an abstraction that allows developers to easily specialize communication...
This paper presents GraphRex, an efficient, robust, scalable, and easy-to-program framework for graph processing on datacenter infrastructure. To users, GraphRex presents a declarative, Datalog-like interface that is natural and expressive. Underneath, it compiles those queries into efficient implementations. A key technical contribution of GraphRe...
In spite of much progress and many advances, cost-effective, high-quality video delivery over the internet remains elusive. To address this ongoing challenge, we propose Sunstar, a solution that leverages simultaneous downloads from multiple servers to preserve video quality. The novelty in Sunstar is not so much in its use of multiple servers but...
"Breaking up" software into a dataflow network of tasks can improve availability and performance by exploiting the flexibility of the resulting graph, more granular resource use, hardware concurrency and modern interconnects. Decomposing legacy systems in this manner is difficult and ad hoc however, raising such challenges as weaker consistency and...
Recent emergence of software-defined networks offers an opportunity to design domain-specific programming abstractions aimed at network operators. In this paper, we propose scenario-based programming, a framework that allows network operators to program network policies by describing example behaviors in representative scenarios. Given these scenar...
Microbursts can degrade application performance in datacenters by causing increased latency, jitter and packet loss. The detection of microbursts and identification of the contributing flows is the first step towards mitigating this problem. Unfortunately, microbursts are unpredictable and typically last for tens or hundreds of μs and the high line ra...
This paper presents a novel integrated platform for the automatic detection and mitigation of denial-of-service (DoS) attacks in networked systems. Recently, these attacks have evolved from simple flooding at the network layer to targeted, application-specific asymmetric attacks. Because of this trend, existing techniques, which rely primarily on...
Failing network links are usually disabled, and packets are routed around them until the links are repaired. While it is often possible to utilize some of a failing link's capacity, losing what remains of a link's capacity is typically deemed preferable to the erratic effect that unreliable links can have on application-level behavior.
We describe...
In recent years, there has been a proliferation in network domain-specific languages (DSL). These languages enable us to exploit the programmability of these networks, while still providing correctness guarantees through verification and analysis of DSLs. However, none of these DSLs have received widespread adoption. First, these new languages requi...
Network failures continue to plague datacenter operators as their symptoms may not have direct correlation with where or why they occur. We introduce 007, a lightweight, always-on diagnosis application that can find problematic links and also pinpoint problems for each TCP connection. 007 is completely contained within the end host. During its two...
A key ingredient to a startup's success is its ability to raise funding at an early stage. Crowdfunding has emerged as an exciting new mechanism for connecting startups with potentially thousands of investors. Nonetheless, little is known about its effectiveness, nor the strategies that entrepreneurs should adopt in order to maximize their rate of...
Graph analytics systems have gained significant popularity due to the prevalence of graph data. Many of these systems are designed to run in a shared-nothing architecture whereby a cluster of machines can process a large graph in parallel. In more recent proposals, others have argued that a single-machine system can achieve better performance and/o...
We propose a demonstration of DeDoS, a platform for mitigating asymmetric DDoS attacks. These attacks are particularly challenging since attackers using limited resources can exhaust the resources of even well-provisioned servers. DeDoS resolves this by splitting monolithic software stacks into separable components called minimum splittable units (...
In network management today, dynamic updates are required for traffic engineering and for timely response to security threats. Decisions for such updates are based on monitoring network traffic to compute numerical quantities based on a variety of network and application-level performance metrics. Today's state-of-the-art tools lack programming abs...
When debugging an SDN application, diagnosing the problem is merely the first step: the operator must still find a fix that solves the problem, without causing new problems elsewhere. However, most existing debuggers focus exclusively on diagnosis and offer the network operator little or no help with finding an effective fix. Finding a suitable fix...
Network provenance, which records the execution history of network events as meta-data, is becoming increasingly important for network accountability and failure diagnosis. For example, network provenance may be used to trace the path that a message traversed in a network, or to reveal how a particular routing entry was derived and the parties invo...
Today, network operators are increasingly playing the role of part-time detectives: they must routinely diagnose intricate problems and malfunctions, e.g., routing or performance issues, and they must often perform forensic investigations of past misbehavior, e.g., intrusions or cybercrimes. However, the current Internet architecture offers little...
This paper presents SplitStack, an architecture targeted at mitigating asymmetric DDoS attacks. These attacks are particularly challenging, since attackers can use a limited amount of resources to trigger exhaustion of a particular type of system resource on the server side. SplitStack resolves this by splitting the monolithic stack into many separ...
Today, root cause analysis of failures in data centers is mostly done through manual inspection. More often than not, customers blame the network as the culprit. However, other components of the system might have caused these failures. To troubleshoot, huge volumes of data are collected over the entire data center. Correlating such large volumes...
In this paper, we propose a new approach to diagnosing problems in complex distributed systems. Our approach is based on the insight that many of the trickiest problems are anomalies. For instance, in a network, problems often affect only a small fraction of the traffic (e.g., perhaps a certain subnet), or they only manifest infrequently. Thus, it...
As declarative query processing techniques expand to the Web, data streams, network routers, and cloud platforms, there is an increasing need to re-plan execution in the presence of unanticipated performance changes. New runtime information may affect which query plan we prefer to run. Adaptive techniques require innovation both in terms of the alg...
Crowdfunding is a recent financing phenomenon that is gaining wide popularity as a means for startups to raise seed funding for their companies. This paper presents our initial results at understanding this phenomenon using an exploratory data driven approach. We have developed a big data platform for collecting and managing data from multiple sour...
Recent emergence of software-defined networks offers an opportunity to design domain-specific programming abstractions aimed at network operators. In this paper, we propose scenario-based programming, a framework that allows network operators to program network policies by describing representative example behaviors. Given these scenarios, our synt...
The performance of networks that use the Internet Protocol is sensitive to precise configuration of many low-level parameters on each network device. These settings govern the action of dynamic routing protocols, which direct the flow of traffic; in order to ensure that these dynamic protocols all converge to produce some 'optimal' flow, each param...
In this paper, we propose a new approach to diagnosing problems in complex networks. Our approach is based on the insight that many of the trickiest problems are anomalies -- they affect only a small fraction of the traffic (e.g., perhaps a certain subnet), or they only manifest infrequently. Thus, it is quite common for the network operator to hav...
When debugging an SDN application, diagnosing the problem is merely the first step -- the operator must still implement a solution that works, and that does not cause new problems elsewhere. However, most existing SDN debuggers focus exclusively on identifying the problem and offer the network operator little or no help with finding an effective fi...
The Internet, as it stands today, is highly vulnerable to attacks. However, little has been done to understand and verify the formal security guarantees of proposed secure inter-domain routing protocols, such as Secure BGP (S-BGP). In this paper, we develop a sound program logic for SANDLog, a declarative specification language for secure routing pr...
This paper presents MTor, a low-latency anonymous group communication system. We construct MTor as an extension to Tor, allowing the construction of multi-source multicast trees on top of the existing Tor infrastructure. MTor does not depend on an external service to broker the group communication, and avoids central points of failure and trust. MT...
Networks are complex systems that unfortunately are ridden with errors. Such errors can lead to disruption of services, which may have grave consequences. Verification of networks is key to eliminating errors and building robust networks. In this paper, we propose an approach to verify networks using declarative networking, where networks are speci...
Cloud today is evolving towards multi-datacenter deployment, with each datacenter serving customers in different geographical areas. The independence between datacenters, however, prohibits effective inter-datacenter resource sharing and flexible management of the infrastructure. In this paper, we propose WL2, a Software-Defined Networking (SDN) so...
Cloud computing offers a new, attractive option to customers for quickly provisioning any size Hadoop cluster, consuming resources as a service, executing their MapReduce workload, and then paying for the time these resources were used. One of the open questions in such environments is the right choice of resources (and their amount) a user should...
The paper seeks to broaden our understanding of MPTCP and focuses on the impact that initial sub-path selection can have on performance. Using empirical data, it demonstrates that which sub-path is chosen to start an MPTCP connection can have unintuitive consequences. Using numerical analysis and a model-driven investigation, the paper elucidates a...
This paper presents Scalanytics, a declarative platform that supports high-performance application layer analysis of network traffic. Scalanytics uses (1) stateful network packet processing techniques for extracting application layer data from network packets, (2) a declarative rule-based language called Analog for compactly specifying analysis pip...
The emergence of programmable interfaces to network controllers offers network operators the flexibility to implement a variety of policies. We propose NetEgg, a programming framework that allows a network operator to specify the desired functionality using example behaviors. Our synthesis algorithm automatically infers the state that needs to be m...
As declarative query processing techniques expand in scope (to the Web, data streams, network routers, and cloud platforms), there is an increasing need for adaptive query processing techniques that can re-plan in the presence of failures or unanticipated performance changes. A status update on the data distributions or the compute nodes may h...
In MapReduce environments, many applications have to achieve different performance goals for producing time-relevant results. One typical user question is how to estimate the completion time of a MapReduce program as a function of varying input dataset sizes and given cluster resources. In this work, we offer a novel performance evaluation fram...
When debugging a distributed system, it is sometimes necessary to explain the absence of an event - for instance, why a certain route is not available, or why a certain packet did not arrive. Existing debuggers offer some support for explaining the presence of events, usually by providing the equivalent of a backtrace in conventional debuggers, but...
NEBULA is a proposal for a Future Internet Architecture. It is based on the assumptions that: (1) cloud computing will comprise an increasing fraction of the application workload offered to an Internet, and (2) that access to cloud computing resources will demand new architectural features from a network. Features that we have identified include de...
With increasing deployment of Multipath TCP (MPTCP) in multihoming and data center scenarios, there is a need to understand how its performance is affected in practice, both by traditional factors such as RTT measurements, and by new multipath-specific considerations such as subflow selection. We carried out an initial but comprehensive study using...
Cloud computing offers a new, attractive option to customers for provisioning a suitable size Hadoop cluster, consuming resources as a service, executing the MapReduce workload, and paying for the time these resources were used. One of the open questions in such environments is the choice and the amount of resources that a user should lease from th...
Cloud computing enables a user to quickly provision any size Hadoop cluster, execute a given MapReduce workload, and then pay for the time the resources were used. Typically, there is a choice of different types of VM instances in the Cloud (e.g., small, medium, or large EC2 instances). The capacity differences of the offered VMs are reflected in V...
The Border Gateway Protocol (BGP) is the single inter-domain routing protocol that enables network operators within each autonomous system (AS) to influence routing decisions by independently setting local policies on route filtering and selection. This independence leads to fragile networking and makes analysis of policy configurations very comple...
This paper presents the design and implementation of Application-Aware Anonymity (A3), an extensible platform for rapidly prototyping and evaluating anonymity protocols on the Internet. A3 supports the development of highly tunable anonymous protocols that enable applications to tailor their anonymity properties and performance characteristics acco...
When debugging an SDN, it is sometimes necessary to explain the absence of an event: why a certain rule was not installed, or why a certain packet did not arrive. Existing SDN debuggers offer some support for explaining the presence of events, usually by providing the equivalent of a "backtrace" in conventional debuggers, but they are not very good...
With the tremendous growth of the Internet and the emerging software-defined networks, there is an increasing need for rigorous and scalable network management methods and tool support. This paper proposes a synthesis approach for managing software-defined networks. We formulate the construction of network control logic as a reactive synthesis prob...
Mapping virtual networks to physical networks under bandwidth constraints is a key computational problem for the management of data centers. Recently proposed heuristic strategies for this problem work efficiently, but are not guaranteed to always find an allocation even when one exists. Given that the bandwidth allocation problem is NP-complete, a...
Many applications associated with live business intelligence are written as complex data analysis programs defined by directed acyclic graphs of MapReduce jobs, for example, using Pig, Hive, or Scope frameworks. An increasing number of these applications have additional requirements for completion time guarantees. In this article, we consider the p...