Publications (13)
Total impact: 0

Conference Paper: Memory-constrained aggregate computation over data streams
ABSTRACT: In this paper, we study the problem of efficiently computing multiple aggregation queries over a data stream. In order to share computation, prior proposals have suggested instantiating certain intermediate aggregates which are then used to generate the final answers for input queries. In this work, we make a number of important contributions aimed at improving the execution and generation of query plans containing intermediate aggregates. These include: (1) a different hashing model, which has low eviction rates, and also allows us to accurately estimate the number of evictions, (2) a comprehensive query execution cost model based on these estimates, (3) an efficient greedy heuristic for constructing good low-cost query plans, (4) provably near-optimal and optimal algorithms for allocating the available memory to aggregates in the query plan when the input data distribution is Zipf-like and Uniform, respectively, and (5) a detailed performance study with real-life IP flow data sets, which shows that our multiple aggregates computation techniques consistently outperform the best-known approach.
Data Engineering (ICDE), 2011 IEEE 27th International Conference on; 05/2011
Conference Paper: Joint Routing and Scheduling in Multihop Wireless Networks with Directional Antennas
ABSTRACT: Long-distance multihop wireless networks have been used in recent years to provide connectivity to rural areas. The salient features of such networks include TDMA channel access, nodes with multiple radios, and point-to-point long-distance wireless links established using high-gain directional antennas mounted on high towers. It has been demonstrated previously that in such network architectures, nodes can transmit concurrently on multiple radios, as well as receive concurrently on multiple radios. However, concurrent transmission on one radio and reception on another radio cause interference. Under this scheduling constraint, given a set of source-destination demand rates, we consider the problem of satisfying the maximum fraction of each demand (also called the maximum concurrent flow problem). We give a novel joint routing and scheduling scheme for this problem, based on linear programming and graph coloring. We analyze our algorithm theoretically and prove that at least 50% of a satisfiable set of demands is satisfied by our algorithm for most practical networks (with maximum node degree at most 5).
INFOCOM, 2010 Proceedings IEEE; 04/2010
Conference Paper: Scalable Content-Based Routing in Pub/Sub Systems
ABSTRACT: In this paper, we develop a framework for achieving scalable and communication-efficient dissemination of content in pub/sub systems. To maximize communication sharing across subscriptions, our routing framework groups subscriptions based on similarity, and transmits content matching one or more subscriptions in a group over a single dissemination tree for the group. We develop a cost model that uses published content samples in conjunction with the knowledge of consumer subscriptions to estimate the communication cost of a set of routing trees for subscription groups. The problem of computing a communication-optimal set of routing trees is then formulated as an optimization problem that seeks to find trees with the minimum cost. It turns out that the problem of computing a minimum-cost tree for a subscription group is a new generalization of the well-known Steiner tree problem, and an interesting problem in its own right. We develop an approximation algorithm that uses low-stretch spanning trees to compute a tree whose communication cost is within a polylogarithmic factor of the optimum. We use this to compute trees for various subscription grouping configurations generated using a greedy clustering strategy, and select the one with the lowest cost. Our experimental study demonstrates the effectiveness of our content-aware routing approach compared to traditional routing based on content-oblivious spanning trees.
INFOCOM 2009, IEEE; 05/2009
Conference Paper: Detecting Anomalies Using End-to-End Path Measurements
ABSTRACT: In this paper, we propose new "low-overhead" network monitoring techniques to detect violations of path-level QoS guarantees like end-to-end delay, loss, etc. Unlike existing path monitoring schemes, our approach does not calculate QoS parameters for all paths. Instead, it monitors QoS values for only a few paths, and exploits the fact that path anomalies are rare and anomalous states are well separated from normal operation, to rule out path QoS violations in most situations. We propose a heuristic to select a small subset of network paths to monitor while ensuring that no QoS violations are missed. Experiments with an ISP topology from the Rocketfuel data set show that our heuristic can deliver almost a 50% decrease in monitoring overhead compared to previous schemes.
INFOCOM 2008. The 27th Conference on Computer Communications. IEEE; 05/2008
Conference Paper: A New Channel Assignment Mechanism for Rural Wireless Mesh Networks
ABSTRACT: In this paper we present a new channel allocation scheme for IEEE 802.11 based mesh networks with point-to-point links, designed for rural areas. Our channel allocation scheme allows continuous full-duplex data transfer on every link in the network. Moreover, we do not require any synchronization across the links as the channel assignment prevents cross-link interference. Our approach is simple. We consider any link in the network as made up of two directed edges. To each directed edge at a node, we assign a non-interfering IEEE 802.11 channel so that the set of channels assigned to the outgoing edges is disjoint from channels assigned to the incoming edges. Evaluation of this scheme in a testbed demonstrates throughput gains of between 50% and 100%, and significantly lower end-to-end delays, over existing link scheduling/channel allocation protocols (such as 2P [11]) designed for point-to-point mesh networks. Formally speaking, this channel allocation scheme is equivalent to an edge-coloring problem that we call the directed edge coloring (DEC) problem. We establish a relationship between this coloring problem and the classical vertex coloring problem, and thus show that this problem is NP-hard. More precisely, we give an algorithm that, given a $k$-vertex coloring of a graph, can directed edge color it using $\xi(k)$ colors, where $\xi(k)$ is the smallest integer $n$ such that $\binom{n}{\lfloor n/2 \rfloor} \geq k$.
INFOCOM 2008. The 27th Conference on Computer Communications. IEEE; 05/2008
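As a small worked illustration of the color bound in this abstract, the sketch below evaluates $\xi(k)$ by direct search, taking the condition to be $\binom{n}{\lfloor n/2 \rfloor} \geq k$ (an assumption about the intended formula, since the listing does not typeset it). The function name `xi` is hypothetical and is not taken from the paper; the code only computes the bound and does not perform the coloring itself.

```python
from math import comb


def xi(k: int) -> int:
    """Return the smallest non-negative integer n with C(n, floor(n/2)) >= k.

    Assumption: this is the bound on the number of channels (colors)
    needed for a directed edge coloring, given a k-vertex coloring.
    """
    if k < 1:
        raise ValueError("k must be a positive integer")
    n = 0
    # The central binomial coefficient grows monotonically in n,
    # so a linear scan finds the smallest n that satisfies the bound.
    while comb(n, n // 2) < k:
        n += 1
    return n


if __name__ == "__main__":
    for k in (2, 3, 4, 6, 7, 20, 21):
        print(k, xi(k))
```

Under this reading, a bipartite (2-colorable) topology needs 2 channels, and because the central binomial coefficient grows quickly, the number of channels grows only slowly with the number of vertex colors.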
Conference Paper: Minimum Cost Topology Construction for Rural Wireless Mesh Networks
ABSTRACT: IEEE 802.11 WiFi equipment based wireless mesh networks have recently been proposed as an inexpensive approach to connect far-flung rural areas. Such networks are built using high-gain directional antennas that can establish long-distance wireless point-to-point links. Some nodes in the network (called gateway nodes) are directly connected to the wired internet, and the remaining nodes connect to the gateway(s) using one or more hops. The dominant cost of constructing such a mesh network is the cost of constructing antenna towers at nodes. The cost of a tower depends on its height, which in turn depends on the length of its links and the physical obstructions along those links. We investigate the problem of selecting which links should be established such that all nodes are connected, while the cost of constructing the antenna towers required to establish the selected links is minimized. We show that this problem is NP-hard and that a better than O(log n) approximation cannot be expected, where n is the number of vertices in the graph. We then present the first algorithm in the literature for this problem with provable performance bounds. More precisely, we present a greedy algorithm that is an O(log n) approximation algorithm for this problem. Finally, through simulations, we compare our approximation algorithm with both the optimal solution and a naive heuristic.
INFOCOM 2008. The 27th Conference on Computer Communications. IEEE; 05/2008
Conference Paper: Efficient Aggregate Computation over Data Streams
ABSTRACT: Cisco's NetFlow collector (NFC) is a powerful example of a real-world product that supports multiple aggregate queries over a continuous stream of IP flow records. NFC enables a plethora of network management tasks like traffic demands estimation, application traffic profiling, etc. In this paper, we investigate two computation sharing techniques for enabling streaming applications such as NFC to scale to hundreds of queries. Our first technique instantiates certain intermediate aggregates which are then used to generate the final answers for input queries. Our second technique coalesces the filter conditions of similar queries and uses the coalesced filter to prefilter stream data input to these queries. Using these techniques, we propose a heuristic to compute a good query plan and perform extensive simulations to show that our heuristic delivers a factor of over 3 performance improvement compared to a naive approach.
Data Engineering, 2008. ICDE 2008. IEEE 24th International Conference on; 05/2008
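The idea of an intermediate aggregate can be illustrated with a minimal sketch. It is not the paper's system or its query planner: the record fields (`src`, `dst`, `bytes`) and the helper names `aggregate` and `rollup` are hypothetical, and the example assumes a simple SUM aggregate held entirely in memory. The stream is scanned once to build a finer-grained aggregate, and two coarser queries are then answered from that result instead of from the raw records.

```python
from collections import Counter


def aggregate(records, key_fields, value_field="bytes"):
    """Scan the records once and sum value_field per group-by key."""
    totals = Counter()
    for rec in records:
        totals[tuple(rec[f] for f in key_fields)] += rec[value_field]
    return totals


def rollup(intermediate, key_fields, target_fields):
    """Derive a coarser SUM aggregate from an intermediate aggregate.

    target_fields must be a subset of key_fields; the raw records are
    not touched again.
    """
    positions = [key_fields.index(f) for f in target_fields]
    totals = Counter()
    for key, value in intermediate.items():
        totals[tuple(key[i] for i in positions)] += value
    return totals


# Hypothetical flow records, for illustration only.
flows = [
    {"src": "a", "dst": "x", "bytes": 10},
    {"src": "a", "dst": "y", "bytes": 5},
    {"src": "b", "dst": "x", "bytes": 7},
    {"src": "a", "dst": "x", "bytes": 3},
]

# One pass over the stream builds the shared intermediate aggregate.
by_src_dst = aggregate(flows, ("src", "dst"))

# Both input queries are answered from the intermediate result.
by_src = rollup(by_src_dst, ("src", "dst"), ("src",))
by_dst = rollup(by_src_dst, ("src", "dst"), ("dst",))
```

The saving comes from the intermediate table usually being much smaller than the stream; whether a given intermediate aggregate is worth instantiating is the query-planning question the abstract refers to.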
Conference Paper: Efficient Constraint Monitoring Using Adaptive Thresholds
ABSTRACT: Detecting constraint violations in large-scale distributed systems has recently attracted plenty of attention from the research community due to its varied applications (security, network monitoring, etc.). Communication efficiency of these systems is a critical concern and determines their practicality. In this paper, we introduce a new set of methods called non-zero slack schemes to implement distributed SUM queries efficiently. We show, both analytically and empirically, that these methods can lead to a considerable reduction in the amount of communication. We propose three adaptive non-zero slack schemes that adapt to changing data distributions; our best scheme is a lightweight reactive scheme that probabilistically adjusts local constraints based on the occurrence of certain events (using only a periodic probability estimation). We conduct an extensive experimental study using real-life and synthetic data sets, and show that our non-zero slack schemes incur significantly less communication overhead than the state-of-the-art zero slack scheme (a saving of over 60%).
Data Engineering, 2008. ICDE 2008. IEEE 24th International Conference on; 05/2008
Conference Paper: Mining (Social) Network Graphs to Detect Random Link Attacks
ABSTRACT: Modern communication networks are vulnerable to attackers who send unsolicited messages to innocent users, wasting network resources and user time. Some examples of such attacks are spam emails, annoying telemarketing phone calls, viral marketing in social networks, etc. Existing techniques to identify these attacks are tailored to certain specific domains (like email spam filtering), but are not applicable to a majority of other networks. We provide a generic abstraction of such attacks, called the Random Link Attack (RLA), that can be used to describe a large class of attacks in communication networks. In an RLA, the malicious user creates a set of false identities and uses them to communicate with a large, random set of innocent users. We mine the social networking graph extracted from user interactions in the communication network to find RLAs. To the best of our knowledge, this is the first attempt to conceptualize the attack definition, applicable to a variety of communication networks. In this paper, we formally define RLA and show that the problem of finding an RLA is NP-complete. We also provide two efficient heuristics to mine subgraphs satisfying the RLA property; the first (GREEDY) is based on greedy set-expansion, and the second (TRWALK) on randomized graph traversal. Our experiments with a real-life data set demonstrate the effectiveness of these algorithms.
Data Engineering, 2008. ICDE 2008. IEEE 24th International Conference on; 05/2008
Conference Paper: Diagnosing Link-Level Anomalies Using Passive Probes
ABSTRACT: In this paper, we develop passive network tomography techniques for inferring link-level anomalies like excessive loss rates and delay from path-level measurements. Our approach involves placing a few passive monitoring devices on strategic links within the network, and then passively monitoring the performance of network paths that pass through those links. In order to keep the monitoring infrastructure and communication costs low, we focus on minimizing (1) the number of passive probe devices deployed, and (2) the set of monitored paths. For mesh topologies, we show that the above two minimization problems are NP-hard, and consequently, devise polynomial-time greedy algorithms that achieve a logarithmic approximation factor, which is the best possible for any algorithm. We also consider tree topologies typical of enterprise networks, and show that while similar NP-hardness results hold, constant factor approximation algorithms are possible for such topologies.
INFOCOM 2007. 26th IEEE International Conference on Computer Communications. IEEE; 06/2007
Conference Paper: Routing and Channel Allocation in Rural Wireless Mesh Networks
ABSTRACT: IEEE 802.11 WiFi equipment based wireless mesh networks have recently been proposed as an inexpensive approach to connect far-flung rural areas. Such networks are built using high-gain directional antennas that can establish long-distance point-to-point links. In recent work, a new MAC protocol named 2P has been proposed that is suited for the interference pattern within such a network. However, the 2P protocol requires the underlying graph (for each 802.11 channel) to be bipartite. Under the assumption that 2P is the MAC protocol used in the mesh network, we make the following contributions in this paper. Given K non-interfering 802.11 channels, we propose a simple cut-based algorithm to compute K bipartite subgraphs (on each of which the 2P protocol can be run separately). We establish the class of graphs that can thus be completely covered by K bipartite subgraphs. For the remaining set of graphs, we look into the "price" of routing all end-to-end demands over only the bipartite subgraphs. We analytically establish what fraction of the max flow of the original mesh graph can be routed over the bipartite subgraphs. Finally, we look into the problem of mismatch between the load on a link (as computed by max flow) and its effective capacity under a given channel allocation. We propose heuristics to cluster links with similar loads into the same bipartite graphs (channels) and, through comprehensive numerical simulations, show that our heuristics come very close to the best possible flow.
INFOCOM 2007. 26th IEEE International Conference on Computer Communications. IEEE; 06/2007
Conference Paper: VillageNet: A low-cost, 802.11-based mesh network for rural regions
ABSTRACT: VillageNet is a wireless mesh network that aims to provide low-cost broadband Internet access for rural regions. The cost of building the network is kept low by using off-the-shelf IEEE 802.11 equipment and optimizing the network topology to minimize cost. In this paper we describe the overall operation of VillageNet and discuss two fundamental problems in building such a network. Nodes in VillageNet communicate using long-distance point-to-point wireless links that are established using high-gain directional antennas. VillageNet uses the 2P MAC protocol [?], which is suited for the interference pattern within such a network. However, the 2P protocol requires the underlying mesh graph (for each 802.11 channel) to be bipartite. Thus, if K channels are available, then an important consideration is how to select K bipartite subgraphs to activate, such that the demands of the nodes are best met. We formally pose this problem and present some initial results. Second, we observe that the dominant cost of constructing such a mesh network is the cost of constructing antenna towers at nodes. The cost of a tower depends on its height, which in turn depends on the length of its links and the physical obstructions along those links. Thus, to minimize cost, we pose the problem of deciding which links should be established, such that all villages are connected and the cost of constructing antenna towers to establish the selected links is minimized.
Communication Systems Software and Middleware, 2007. COMSWARE 2007. 2nd International Conference on; 02/2007
ABSTRACT: VillageNet is a new wireless mesh networking technology that provides low-cost broadband Internet access for wide regions. It targets the rural market around the world, where large populations live but paying capacities are low. VillageNet offers a low-cost, high-performance alternative to traditional wireline/cellular technologies that have prohibitively expensive deployment costs. VillageNet connects villages in a mesh using long-distance wireless links. The cost of building the network is kept low by using off-the-shelf Institute of Electrical and Electronics Engineers (IEEE) 802.11 equipment and optimizing the network topology to minimize cost. In this paper, we describe the overall operation and architecture of the VillageNet network. We also describe the various technical challenges surrounding channel allocation, link scheduling, and topology construction for these networks and present some initial results for these problems. Finally, we outline several interesting open issues for building rural wireless mesh networks. © 2007 Alcatel-Lucent.
Bell Labs Technical Journal. 01/2007; 12:119-131.
Publication Stats
118  Citations  
Institutions

2009

Yahoo! Labs
Sunnyvale, California, United States


2008

Massachusetts Institute of Technology
Cambridge, Massachusetts, United States 
Alcatel-Lucent
Paris, Île-de-France, France


2007

Stanford University
Palo Alto, California, United States
