Publications (23) · 10.64 Total impact
ABSTRACT: What does a typical road network look like? Existing generative models tend to focus on one aspect to the exclusion of others. We introduce the general-purpose quadtree model and analyze its shortest paths and maximum flow.
Computing Research Repository - CORR. 08/2010;
Conference Paper: Storage Capacity of Labeled Graphs.
ABSTRACT: We consider the question of how much information can be stored by labeling the vertices of a connected undirected graph G using a constant-size set of labels, when isomorphic labelings are not distinguishable. An exact information-theoretic bound is easily obtained by counting the number of isomorphism classes of labelings of G, which we call the information-theoretic capacity of the graph. More interesting is the effective capacity of members of some class of graphs, the number of states distinguishable by a Turing machine that uses the labeled graph itself in place of the usual linear tape. We show that the effective capacity equals the information-theoretic capacity up to constant factors for trees, random graphs with polynomial edge probabilities, and bounded-degree graphs.
Stabilization, Safety, and Security of Distributed Systems - 12th International Symposium, SSS 2010, New York, NY, USA, September 20-22, 2010. Proceedings; 01/2010
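As a rough illustration of the counting step mentioned in this abstract, the number of isomorphism classes of labelings can be computed by brute force on very small graphs. This sketch is not taken from the paper: the function names and the graph representation (a vertex count plus an edge list for a simple graph) are assumptions made for illustration, and the enumeration is exponential in the number of vertices.

```python
from itertools import permutations, product
from math import log2


def automorphisms(n, edges):
    """Return every vertex permutation that maps the edge set onto itself."""
    edge_set = {frozenset(e) for e in edges}
    result = []
    for perm in permutations(range(n)):
        # A bijection on vertices that sends every edge to an edge
        # is an automorphism of a finite simple graph.
        if all(frozenset((perm[u], perm[v])) in edge_set for u, v in edges):
            result.append(perm)
    return result


def count_labeling_classes(n, edges, num_labels):
    """Count labelings of the graph up to automorphism (hypothetical helper)."""
    autos = automorphisms(n, edges)
    canonical_forms = set()
    for labeling in product(range(num_labels), repeat=n):
        # The smallest relabeled tuple serves as a canonical representative
        # of the orbit of this labeling under the automorphism group.
        canonical = min(tuple(labeling[p[v]] for v in range(n)) for p in autos)
        canonical_forms.add(canonical)
    return len(canonical_forms)


path3 = [(0, 1), (1, 2)]
cycle4 = [(0, 1), (1, 2), (2, 3), (3, 0)]
print(count_labeling_classes(3, path3, 2))   # 6
print(count_labeling_classes(4, cycle4, 2))  # 6
print(log2(count_labeling_classes(3, path3, 2)))  # capacity in bits
```

Taking the base-2 logarithm of the class count gives the capacity in bits for that graph and label set; the 3-vertex path with two labels has 8 raw labelings but only 6 distinguishable ones.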
Conference Paper: Lower Bounds on Learning Random Structures with Statistical Queries.
ABSTRACT: We show that random DNF formulas, random log-depth decision trees and random deterministic finite acceptors cannot be weakly learned with a polynomial number of statistical queries with respect to an arbitrary distribution.
Algorithmic Learning Theory, 21st International Conference, ALT 2010, Canberra, Australia, October 6-8, 2010. Proceedings; 01/2010
Conference Paper: Low-contention data structures.
ABSTRACT: We consider the problem of minimizing contention in static dictionary data structures, where the contention on each cell is measured by the expected number of probes to that cell given an input that is chosen from a distribution that is not known to the query algorithm (but that may be known when the data structure is built). When all positive queries are equally probable, and similarly all negative queries are equally probable, we show that it is possible to construct a data structure using linear space s, a constant number of queries, and with contention O(1/s) on each cell, corresponding to a nearly-flat load distribution. All of these quantities are asymptotically optimal. For arbitrary query distributions, the lack of knowledge of the query distribution by the query algorithm prevents perfect load leveling in this case: we present a lower bound, based on VC-dimension, that shows that for a wide range of data structure problems, achieving contention even within a polylogarithmic factor of optimal requires a cell-probe complexity of Ω(log log n).
SPAA 2010: Proceedings of the 22nd Annual ACM Symposium on Parallelism in Algorithms and Architectures, Thira, Santorini, Greece, June 13-15, 2010; 01/2010
ABSTRACT: We show that 2 is the minimum VC dimension of a concept class whose k-fold union has VC dimension Ω(k log k).
Information Processing Letters 01/2009; 109:1232-1234. · 0.49 Impact Factor
Journal of Machine Learning Research 01/2009; 10:1881-1911. · 3.42 Impact Factor
ABSTRACT: We describe and analyze a 3-state one-way population protocol to compute approximate majority in the model in which pairs of agents are drawn uniformly at random to interact. Given an initial configuration of x’s, y’s and blanks that contains at least one non-blank, the goal is for the agents to reach consensus on one of the values x or y. Additionally, the value chosen should be the majority non-blank initial value, provided it exceeds the minority by a sufficient margin. We prove that with high probability n agents reach consensus in O(n log n) interactions and the value chosen is the majority provided that its initial margin is at least $\omega(\sqrt{n} \log n)$. This protocol has the additional property of tolerating Byzantine behavior in $o(\sqrt{n})$ of the agents, making it the first known population protocol that tolerates Byzantine agents.
Distributed Computing 06/2008; 21(2):87-102. · 0.63 Impact Factor
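The interaction model in this abstract can be simulated in a few lines. The sketch below is an illustration under stated assumptions, not the authors' code: the abstract does not list the transition rules, so the ones used here (an initiator holding x or y blanks out a responder holding the opposite value, and converts a blank responder to its own value) are assumed, and the function name, parameters, and interaction cap are hypothetical.

```python
import random

BLANK = "b"


def approximate_majority(num_x, num_y, num_blank=0, seed=None,
                         max_interactions=10**6):
    """Simulate an assumed 3-state one-way approximate-majority protocol.

    Returns (winner, interactions); winner is None if the cap is reached.
    """
    if num_x + num_y == 0:
        raise ValueError("need at least one non-blank agent")
    rng = random.Random(seed)
    agents = ["x"] * num_x + ["y"] * num_y + [BLANK] * num_blank
    n = len(agents)
    counts = {"x": num_x, "y": num_y, BLANK: num_blank}
    interactions = 0
    while counts["x"] < n and counts["y"] < n:
        if interactions == max_interactions:
            return None, interactions
        # Draw an ordered pair of distinct agents uniformly at random.
        i, j = rng.sample(range(n), 2)
        initiator, responder = agents[i], agents[j]
        # One-way: only the responder may change state.
        if initiator != BLANK and initiator != responder:
            new_state = initiator if responder == BLANK else BLANK
            counts[responder] -= 1
            counts[new_state] += 1
            agents[j] = new_state
        interactions += 1
    winner = "x" if counts["x"] == n else "y"
    return winner, interactions


winner, steps = approximate_majority(240, 60, seed=1)
print(winner, steps)
```

With a large initial margin such as 240 against 60, the run should end with every agent holding x; with a small margin either value may win, which is consistent with the margin condition in the abstract.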
Article: Two-enqueuer queue in Common2
ABSTRACT: The question of whether all shared objects with consensus number 2 belong to Common2, the set of objects that can be implemented in a wait-free manner by any type of consensus number 2, was first posed by Herlihy. In the absence of general results, several researchers have obtained implementations for restricted-concurrency versions of FIFO queues. We present the first Common2 algorithm for a queue with two enqueuers and any number of dequeuers.
06/2008;
Conference Paper: Learning Acyclic Probabilistic Circuits Using Test Paths.
ABSTRACT: We define a model of learning probabilistic acyclic circuits using value injection queries, in which an arbitrary subset of wires is set to fixed values, and the value on the single output wire is observed. We adapt the approach of using test paths from the Circuit Builder algorithm (AACW06) to show that there is a polynomial time algorithm that uses value injection queries to learn Boolean probabilistic circuits of constant fan-in and log depth. In the process, we discover that test paths fail utterly for circuits over alphabets of size greater than two and establish upper and lower bounds on the attenuation factor for general and transitively reduced Boolean probabilistic circuits of test paths versus general experiments. To overcome the limitations of test paths for non-Boolean alphabets, we introduce function injection queries, which allow the symbols on a wire to be mapped to other symbols rather than just to themselves or constants.
21st Annual Conference on Learning Theory - COLT 2008, Helsinki, Finland, July 9-12, 2008; 01/2008
ABSTRACT: For a rooted graph G, let EV(G;p) be the expected number of vertices reachable from the root when each edge has an independent probability p of operating successfully. We examine combinatorial properties of this polynomial, proving that G is k-edge connected iff $EV'(G;1) = \cdots = EV^{(k-1)}(G;1) = 0$. We find bounds on the first and second derivatives of EV(G;p); applications yield characterizations of rooted paths and cycles in terms of the polynomial. We prove reconstruction results for rooted trees and a negative result concerning reconstruction of more complicated rooted graphs. We conclude by proving the norm of the largest root of EV(G;p) in Q(i) gives a sharp lower bound on the number of vertices of G.
SIAM J. Discrete Math. 01/2008; 22:776-785.
ABSTRACT: For a rooted graph G, let EVb(G;p) be the expected number of vertices reachable from the root when each edge has an independent probability p of operating successfully. We determine the expected value of EVb(G;p) for random trees, and include a connection to unrooted trees. We also consider rooted digraphs, computing the expected value of a random orientation of a rooted graph G in terms of EVb(G;p). We consider optimal location of the root vertex for the class of grid graphs, and we also briefly discuss a polynomial that incorporates vertex failure.
Discrete Applied Mathematics 01/2008; 156:746-756. · 0.72 Impact Factor
ABSTRACT: We describe and analyze a 3-state one-way population protocol for approximate majority in the model in which pairs of agents are drawn uniformly at random to interact. Given an initial configuration of x’s, y’s and blanks that contains at least one non-blank, the goal is for the agents to reach consensus on one of the values x or y. Additionally, the value chosen should be the majority non-blank initial value, provided it exceeds the minority by a sufficient margin. We prove that with high probability n agents reach consensus in O(n log n) interactions and the value chosen is the majority provided that its initial margin is at least $\omega(\sqrt{n \log n})$. This protocol has the additional property of tolerating Byzantine behavior in $o(\sqrt{n})$ of the agents, making it the first known population protocol that tolerates Byzantine agents. Turning to the register machine construction from [2], we apply the 3-state approximate majority protocol and other techniques to speed up the per-step parallel time overhead of the simulation from O(log^4 n) to O(log^2 n). To increase the robustness of the phase clock at the heart of the register machine, we describe a consensus version of the phase clock and present encouraging simulation results; its analysis remains an open problem.
09/2007: pages 20-32;
Article: The VC dimension of k-fold union.
ABSTRACT: The known O(dk log k) bound on the VC dimension of k-fold unions or intersections of a given concept class with VC dimension d is shown to be asymptotically tight.
Information Processing Letters 01/2007; 101:181-184. · 0.49 Impact Factor
ABSTRACT: Fast algorithms are presented for performing computations in a probabilistic population model. This is a variant of the standard population protocol model, in which finite-state agents interact in pairs under the control of an adversary scheduler, where all pairs are equally likely to be chosen for each interaction. It is shown that when a unique leader agent is provided in the initial population, the population can simulate a virtual register machine in which standard arithmetic operations like comparison, addition, subtraction, and multiplication and division by constants can be simulated in O(n log^4 n) interactions with high probability. Applications include a reduction of the cost of computing a semilinear predicate to O(n log^4 n) interactions from the previously best-known bound of O(n^2 log n) interactions and simulation of a LOGSPACE Turing machine using the same O(n log^4 n) interactions per step. These bounds on interactions translate into O(log^4 n) time per step in a natural parallel model in which each agent participates in an expected Θ(1) interactions per time unit. The central method is the extensive use of epidemics to propagate information from and to the leader, combined with an epidemic-based phase clock used to detect when these epidemics are likely to be complete.
10/2006: pages 61-75;
ABSTRACT: We consider the model of population protocols introduced by Angluin et al., in which anonymous finite-state agents stably compute a predicate of the multiset of their inputs via two-way interactions in the all-pairs family of communication networks. We prove that all predicates stably computable in this model (and certain generalizations of it) are semilinear, answering a central open question about the power of the model. Removing the assumption of two-way interaction, we also consider several variants of the model in which agents communicate by anonymous message-passing where the recipient of each message is chosen by an adversary and the sender is not identified to the recipient. These one-way models are distinguished by whether messages are delivered immediately or after a delay, whether a sender can record that it has sent a message, and whether a recipient can queue incoming messages, refusing to accept new messages until it has had a chance to send out messages of its own. We characterize the classes of predicates stably computable in each of these one-way models using natural subclasses of the semilinear predicates.
Distributed Computing 09/2006; · 0.63 Impact Factor
ABSTRACT: The greedoid Tutte polynomial of a tree is equivalent to a generating function that encodes information about the number of subtrees with I internal (non-leaf) edges and L leaf edges, for all I and L. We prove that this information does not uniquely determine the tree T by constructing an infinite family of pairs of non-isomorphic caterpillars, each pair having identical subtree data. This disproves conjectures of [S. Chaudhary, G. Gordon, Tutte polynomials for trees, J. Graph Theory 15 (1991) 317–331] and [G. Gordon, E. McDonnell, D. Orloff, N. Yung, On the Tutte polynomial of a tree, Congr. Numer. 108 (1995) 141–151] and contrasts with the situation for rooted trees, where this data completely determines the rooted tree.
Discrete Mathematics 01/2006; 306:827-830. · 0.58 Impact Factor
ABSTRACT: Fast algorithms are presented for performing computations in a probabilistic population model. This is a variant of the standard population protocol model, in which finite-state agents interact in pairs under the control of an adversary scheduler, where all pairs are equally likely to be chosen for each interaction. It is shown that when a unique leader agent is provided in the initial population, the population can simulate a virtual register machine with high probability in which standard arithmetic operations like comparison, addition, subtraction, and multiplication and division by constants can be simulated in O(n log^5 n) interactions using a simple register representation or in O(n log^2 n) interactions using a more sophisticated representation that requires an extra O(n log^{O(1)} n)-interaction initialization step. The central method is the extensive use of epidemics to propagate information from and to the leader, combined with an epidemic-based phase clock used to detect when these epidemics are likely to be complete. Applications include a reduction of the cost of computing a semilinear predicate to O(n log^5 n) interactions from the previously best-known bound of O(n^2 log n) interactions and simulation of a LOGSPACE Turing machine using O(n log^2 n) interactions per step after an initial O(n log^{O(1)} n)-interaction startup phase. These bounds on interactions translate into polylogarithmic time per step in a natural parallel model in which each agent participates in an expected Θ(1) interactions per time unit. Open problems are discussed, together with simulation results that suggest the possibility of removing the initial-leader assumption.
Distributed Computing 01/2006; 21(3):183-199. · 0.63 Impact Factor
Conference Paper: Stably computable predicates are semilinear.
ABSTRACT: We consider the model of population protocols introduced by Angluin et al. (2), in which anonymous finite-state agents stably compute a predicate of their inputs via two-way interactions in the all-pairs family of communication networks. We prove that all predicates stably computable in this model (and certain generalizations of it) are semilinear, answering a central open question about the power of the model.
Proceedings of the Twenty-Fifth Annual ACM Symposium on Principles of Distributed Computing, PODC 2006, Denver, CO, USA, July 23-26, 2006; 01/2006
Conference Paper: On the Power of Anonymous One-Way Communication.
ABSTRACT: We consider a population of anonymous processes communicating via anonymous message-passing, where the recipient of each message is chosen by an adversary and the sender is not identified to the recipient. Even with unbounded message sizes and process states, such a system can compute only limited predicates on inputs held by the processes. In the finite-state case, we show how the exact strength of the model depends critically on design choices that are irrelevant in the unbounded-state case, such as whether messages are delivered immediately or after a delay, whether a sender can record that it has sent a message, and whether a recipient can queue incoming messages, refusing to accept new messages until it has had a chance to send out messages of its own. These results may have implications for the design of distributed systems where processor power is severely limited, as in sensor networks.
Principles of Distributed Systems, 9th International Conference, OPODIS 2005, Pisa, Italy, December 12-14, 2005, Revised Selected Papers; 01/2005
ABSTRACT: Transactional memory (TM) systems seek to increase scalability, reduce programming complexity, and overcome the various semantic problems associated with locks. Software TM proposals run on stock processors and provide substantial flexibility in policy, but incur significant overhead for data versioning and validation in the face of conflicting transactions. Hardware TM proposals have the advantage of speed, but are typically highly ambitious, embed significant amounts of policy in silicon, and provide no clear migration path for software that must also run on legacy machines. We advocate an intermediate approach, in which hardware is used to accelerate a TM implementation controlled fundamentally by software. We present a system, RTM, that embodies this approach. It consists of a novel transactional MESI (TMESI) protocol and accompanying TM software. TMESI eliminates the key overheads of data copying, garbage collection, and validation without introducing any global consensus algorithm in the cache coherence protocol, or any new bus transactions. The only change to the snooping interface is a “threatened” signal analogous to the existing “shared” signal. By leaving policy to software, RTM allows us to experiment with a wide variety of policies for contention management, deadlock and livelock avoidance, data granularity, nesting, and virtualization.
Publication Stats
349  Citations  
10.64  Total Impact Points  
Institutions

2010

Brown University
 Department of Computer Science
Providence, Rhode Island, United States 
Yale University
 Department of Computer Science
New Haven, Connecticut, United States


2006–2008

Princeton University
 Department of Computer Science
Princeton, New Jersey, United States 
University of Rochester
Rochester, New York, United States
