Publications (27)
Understanding the dynamics of evolving social or infrastructure networks is a challenge in applied areas such as epidemiology, viral marketing, or urban planning. During the past decade, data has been collected on such networks but has yet to be fully analyzed. We propose to use information on the dynamics of the data to find stable partitions of the network into groups. For that purpose, we introduce a time-dependent, dynamic version of the facility location problem that includes a switching cost when a client's assignment changes from one facility to another. This might provide a better representation of an evolving network, emphasizing the abrupt change of relationships between subjects rather than the continuous evolution of the underlying network. We show that in realistic examples this model indeed yields better-fitting solutions than optimizing every snapshot independently. We present an $O(\log nT)$-approximation algorithm and a matching hardness result, where $n$ is the number of clients and $T$ the number of time steps. We also give another algorithm with approximation ratio $O(\log nT)$ for the variant where one pays at each time step (leasing) for each open facility.
ABSTRACT: We consider variants of the metric k-center problem. Imagine that you must choose locations for k firehouses in a city so as to minimize the maximum distance of a house from the nearest firehouse. An instance is specified by a graph with arbitrary nonnegative edge lengths, a set of vertices that can serve as firehouses (i.e., centers) and a set of vertices that represent houses. For general graphs, this problem is exactly equivalent to the metric k-center problem, which is APX-hard. We give a polynomial-time bicriteria approximation scheme when the input graph is a planar graph. We also give polynomial-time bicriteria approximation schemes for several generalizations: if, instead of all houses, we wish to cover a specified proportion of the houses; if the candidate locations for firehouses have rental costs and we wish to minimize not the number of firehouses but the sum of their rental costs; and if the input graph is not planar but is of bounded genus.
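As a small illustration of the objective described in this abstract, the sketch below evaluates the cost of one given choice of firehouses: the maximum, over all houses, of the shortest-path distance to the nearest chosen center. It is not the approximation scheme from the paper. The function name `kcenter_cost` and the adjacency-list input format are assumptions made for this example.

```python
import heapq

def kcenter_cost(graph, centers, houses):
    """Return the largest distance from any house to its nearest center.

    graph   : dict mapping vertex -> list of (neighbor, nonnegative length);
              an undirected edge must be listed in both directions
              (hypothetical input format, chosen for this sketch)
    centers : iterable of vertices chosen as firehouses
    houses  : iterable of vertices that must be served
    """
    # Multi-source Dijkstra: every center starts at distance 0.
    dist = {c: 0.0 for c in centers}
    heap = [(0.0, c) for c in dist]
    heapq.heapify(heap)
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, length in graph.get(u, []):
            nd = d + length
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    # A house unreachable from every center has infinite cost.
    return max(dist.get(h, float("inf")) for h in houses)

# Path a - b - c - d with edge lengths 1, 2, 3.
path = {
    "a": [("b", 1.0)],
    "b": [("a", 1.0), ("c", 2.0)],
    "c": [("b", 2.0), ("d", 3.0)],
    "d": [("c", 3.0)],
}
```

With the single center `b` and houses `a` and `d`, the cost is the distance from `d` to `b`, which is 5. Adding `d` as a second center and serving houses `a` and `c` lowers the cost to 2.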
Conference Paper: Linear-time algorithms for max flow and multiple-source shortest paths in unit-weight planar graphs
We give simple linear-time algorithms for two problems in planar graphs: max st-flow in directed graphs with unit capacities, and multiple-source shortest paths in undirected graphs with unit lengths.
ABSTRACT: Recent years have seen the development of several different systems for software transactional memory (STM). Most either employ locks in the underlying implementation or depend on thread-safe general-purpose garbage collection to collect stale data and metadata. We consider the design of low-overhead, obstruction-free software transactional memory for non-garbage-collected languages. Our design eliminates dynamic allocation of transactional metadata and co-locates data that are separate in other systems, thereby reducing the expected number of cache misses on the common-case code path, while preserving nonblocking progress and requiring no atomic instructions other than single-word load, store, and compare-and-swap (or load-linked/store-conditional). We also employ a simple, epoch-based storage management system and introduce a novel conservative mechanism to make reader transactions visible to writers without inducing additional metadata copying or dynamic allocation. Experimental results show throughput significantly higher than that of existing nonblocking STM systems, and highlight significant application-specific differences among conflict detection and validation strategies.
ABSTRACT: We give an $O(n \log^3 n)$-time approximation scheme for Steiner forest in planar graphs, improving on the previous approximation scheme for this problem, which runs in $O(n^{f(\epsilon)})$ time.
Conference Paper: Lower Bounds on Learning Random Structures with Statistical Queries
We show that random DNF formulas, random log-depth decision trees and random deterministic finite acceptors cannot be weakly learned with a polynomial number of statistical queries with respect to an arbitrary distribution.
Conference Paper: Storage Capacity of Labeled Graphs.
We consider the question of how much information can be stored by labeling the vertices of a connected undirected graph G using a constant-size set of labels, when isomorphic labelings are not distinguishable. An exact information-theoretic bound is easily obtained by counting the number of isomorphism classes of labelings of G, which we call the information-theoretic capacity of the graph. More interesting is the effective capacity of members of some class of graphs, the number of states distinguishable by a Turing machine that uses the labeled graph itself in place of the usual linear tape. We show that the effective capacity equals the information-theoretic capacity up to constant factors for trees, random graphs with polynomial edge probabilities, and bounded-degree graphs.
ABSTRACT: What does a typical road network look like? Existing generative models tend to focus on one aspect to the exclusion of others. We introduce the general-purpose \emph{quadtree model} and analyze its shortest paths and maximum flow.
Conference Paper: Low-Contention Data Structures
We consider the problem of minimizing contention in static dictionary data structures, where the contention on each cell is measured by the expected number of probes to that cell given an input that is chosen from a distribution that is not known to the query algorithm (but that may be known when the data structure is built). When all positive queries are equally probable, and similarly all negative queries are equally probable, we show that it is possible to construct a data structure using linear space s, a constant number of queries, and with contention O(1/s) on each cell, corresponding to a nearly-flat load distribution. All of these quantities are asymptotically optimal. For arbitrary query distributions, the lack of knowledge of the query distribution by the query algorithm prevents perfect load leveling in this case: we present a lower bound, based on VC-dimension, that shows that for a wide range of data structure problems, achieving contention even within a polylogarithmic factor of optimal requires a cell-probe complexity of Ω(log log n).
We show that 2 is the minimum VC dimension of a concept class whose k-fold union has VC dimension $\Omega(k \log k)$.
Article: Two-enqueuer queue in Common2
The question of whether all shared objects with consensus number 2 belong to Common2, the set of objects that can be implemented in a wait-free manner by any type of consensus number 2, was first posed by Herlihy. In the absence of general results, several researchers have obtained implementations for restricted-concurrency versions of FIFO queues. We present the first Common2 algorithm for a queue with two enqueuers and any number of dequeuers.
ABSTRACT: For a rooted graph G, let EV(G;p) be the expected number of vertices reachable from the root when each edge has an independent probability p of operating successfully. We examine combinatorial properties of this polynomial, proving that G is k-edge connected iff $EV'(G;1) = \cdots = EV^{(k-1)}(G;1) = 0$. We find bounds on the first and second derivatives of EV(G;p); applications yield characterizations of rooted paths and cycles in terms of the polynomial. We prove reconstruction results for rooted trees and a negative result concerning reconstruction of more complicated rooted graphs. We conclude by proving the norm of the largest root of EV(G;p) in Q(i) gives a sharp lower bound on the number of vertices of G.
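The definition of EV(G;p) in this abstract can be checked on small graphs by direct enumeration. The sketch below sums, over every subset of surviving edges, the probability of that subset times the number of vertices reachable from the root. It takes exponential time and is meant only to illustrate the definition. The function name `expected_reachable` is hypothetical, and counting the root itself as reachable is an assumption of this sketch; the abstract does not state the convention.

```python
from itertools import combinations

def expected_reachable(n, edges, root, p):
    """Brute-force EV(G;p) for an undirected graph on vertices 0..n-1.

    Each edge operates independently with probability p. The root is
    counted as reachable from itself (an assumption of this sketch).
    """
    m = len(edges)
    total = 0.0
    for k in range(m + 1):
        for alive in combinations(range(m), k):
            # Build adjacency lists using only the surviving edges.
            adj = {v: [] for v in range(n)}
            for i in alive:
                u, v = edges[i]
                adj[u].append(v)
                adj[v].append(u)
            # Depth-first search from the root.
            seen = {root}
            stack = [root]
            while stack:
                u = stack.pop()
                for w in adj[u]:
                    if w not in seen:
                        seen.add(w)
                        stack.append(w)
            # Probability that exactly this edge subset survives.
            total += (p ** k) * ((1 - p) ** (m - k)) * len(seen)
    return total

rooted_path = [(0, 1), (1, 2)]        # path 0 - 1 - 2, rooted at 0
triangle = [(0, 1), (1, 2), (0, 2)]   # 3-cycle, rooted at 0
```

Under this convention the rooted path gives $1 + p + p^2$ and the triangle gives $1 + 2(p + p^2 - p^3)$. The triangle's first derivative, $2(1 + 2p - 3p^2)$, vanishes at $p = 1$, while the path's does not. This agrees with the edge-connectivity criterion in the abstract for $k = 2$.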
ABSTRACT: For a rooted graph G, let EVb(G;p) be the expected number of vertices reachable from the root when each edge has an independent probability p of operating successfully. We determine the expected value of EVb(G;p) for random trees, and include a connection to unrooted trees. We also consider rooted digraphs, computing the expected value of a random orientation of a rooted graph G in terms of EVb(G;p). We consider optimal location of the root vertex for the class of grid graphs, and we also briefly discuss a polynomial that incorporates vertex failure. 
Conference Paper: Learning Acyclic Probabilistic Circuits Using Test Paths
We define a model of learning probabilistic acyclic circuits using value injection queries, in which an arbitrary subset of wires is set to fixed values, and the value on the single output wire is observed. We adapt the approach of using test paths from the Circuit Builder algorithm (AACW06) to show that there is a polynomial time algorithm that uses value injection queries to learn Boolean probabilistic circuits of constant fan-in and log depth. In the process, we discover that test paths fail utterly for circuits over alphabets of size greater than two and establish upper and lower bounds on the attenuation factor for general and transitively reduced Boolean probabilistic circuits of test paths versus general experiments. To overcome the limitations of test paths for non-Boolean alphabets, we introduce function injection queries, which allow the symbols on a wire to be mapped to other symbols rather than just to themselves or constants.
ABSTRACT: We describe and analyze a 3-state one-way population protocol for approximate majority in the model in which pairs of agents are drawn uniformly at random to interact. Given an initial configuration of x’s, y’s and blanks that contains at least one nonblank, the goal is for the agents to reach consensus on one of the values x or y. Additionally, the value chosen should be the majority nonblank initial value, provided it exceeds the minority by a sufficient margin. We prove that with high probability n agents reach consensus in $O(n \log n)$ interactions and the value chosen is the majority provided that its initial margin is at least $\omega(\sqrt{n \log n})$. This protocol has the additional property of tolerating Byzantine behavior in $o(\sqrt{n})$ of the agents, making it the first known population protocol that tolerates Byzantine agents. Turning to the register machine construction from [2], we apply the 3-state approximate majority protocol and other techniques to speed up the per-step parallel time overhead of the simulation from $O(\log^4 n)$ to $O(\log^2 n)$. To increase the robustness of the phase clock at the heart of the register machine, we describe a consensus version of the phase clock and present encouraging simulation results; its analysis remains an open problem.
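A short simulation can make the interaction model in this abstract concrete. The abstract does not list the transition rules, so the rules below are an assumption: a nonblank initiator turns a blank responder into its own value and turns a responder holding the opposite value into a blank, and the initiator never changes (one-way). The function name `approximate_majority` and its parameters are likewise hypothetical.

```python
import random

def approximate_majority(num_x, num_y, num_blank, seed=0, max_steps=10_000_000):
    """Simulate a 3-state one-way protocol under uniform random pairing.

    Assumed rules (not stated in the abstract), written
    (initiator, responder) -> new responder state:
        (x, y) -> blank    (y, x) -> blank
        (x, blank) -> x    (y, blank) -> y
    Returns (winner, interactions); winner is None if max_steps is hit.
    Needs at least two agents, at least one of them nonblank.
    """
    rng = random.Random(seed)
    agents = ["x"] * num_x + ["y"] * num_y + ["b"] * num_blank
    n = len(agents)
    counts = {"x": num_x, "y": num_y, "b": num_blank}
    steps = 0
    while steps < max_steps:
        if counts["x"] == n:
            return "x", steps
        if counts["y"] == n:
            return "y", steps
        # Draw an ordered pair of distinct agents uniformly at random.
        i, j = rng.sample(range(n), 2)
        initiator, responder = agents[i], agents[j]
        if initiator != "b" and responder != initiator:
            # Blank responder adopts the value; opposite value is blanked.
            new_state = initiator if responder == "b" else "b"
            counts[responder] -= 1
            counts[new_state] += 1
            agents[j] = new_state
        steps += 1
    return None, steps
```

For example, `approximate_majority(80, 20, 0)` starts 100 agents with a margin of 60 in favour of x. Calling the function with different seeds and margins gives a rough empirical feel for the $O(n \log n)$ interaction bound and the margin condition in the abstract.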
Article: The VC dimension of k-fold union
The known $O(dk \log k)$ bound on the VC dimension of k-fold unions or intersections of a given concept class with VC dimension d is shown to be asymptotically tight.
ABSTRACT: Fast algorithms are presented for performing computations in a probabilistic population model. This is a variant of the standard population protocol model (in which finite-state agents interact in pairs under the control of an adversary scheduler) where all pairs are equally likely to be chosen for each interaction. It is shown that when a unique leader agent is provided in the initial population, the population can simulate a virtual register machine in which standard arithmetic operations like comparison, addition, subtraction, and multiplication and division by constants can be simulated in $O(n \log^4 n)$ interactions with high probability. Applications include a reduction of the cost of computing a semilinear predicate to $O(n \log^4 n)$ interactions from the previously best-known bound of $O(n^2 \log n)$ interactions and simulation of a LOGSPACE Turing machine using the same $O(n \log^4 n)$ interactions per step. These bounds on interactions translate into $O(\log^4 n)$ time per step in a natural parallel model in which each agent participates in an expected Θ(1) interactions per time unit. The central method is the extensive use of epidemics to propagate information from and to the leader, combined with an epidemic-based phase clock used to detect when these epidemics are likely to be complete.
ABSTRACT: We consider the model of population protocols introduced by Angluin et al., in which anonymous finite-state agents stably compute a predicate of the multiset of their inputs via two-way interactions in the all-pairs family of communication networks. We prove that all predicates stably computable in this model (and certain generalizations of it) are semilinear, answering a central open question about the power of the model. Removing the assumption of two-way interaction, we also consider several variants of the model in which agents communicate by anonymous message-passing where the recipient of each message is chosen by an adversary and the sender is not identified to the recipient. These one-way models are distinguished by whether messages are delivered immediately or after a delay, whether a sender can record that it has sent a message, and whether a recipient can queue incoming messages, refusing to accept new messages until it has had a chance to send out messages of its own. We characterize the classes of predicates stably computable in each of these one-way models using natural subclasses of the semilinear predicates.
