Yoram Moses

Technion - Israel Institute of Technology | technion · Laboratory for VLSI (with EE)

About

133
Publications
13,959
Reads
7,262
Citations
Citations since 2016
36 Research Items
1736 Citations

Publications (133)
Article
Full-text available
Current-day data centers and high-volume cloud services employ a broad set of heterogeneous servers. In such settings, client requests typically arrive at multiple entry points, and dispatching them to servers is an urgent distributed systems problem. This paper presents an efficient solution to the load balancing problem in such systems that impro...
Preprint
This paper investigates the transfer of information in fault-prone synchronous systems using null messages. The notion of an $f$-resilient message block is defined to capture the fundamental communication pattern for knowledge transfer. This pattern may involve null messages in addition to explicit messages, and hence, it provides a fault-tolerant...
Article
Full-text available
The unbeatability of a consensus protocol, introduced by Halpern et al. (SIAM J Comput 31:838–865, 2001), is a stronger notion of optimality than the accepted notion of early stopping protocols. Using a novel knowledge-based analysis, this paper derives the first explicit unbeatable consensus protocols in the literature, for the standard synchronou...
Article
With the rapid increase in the size and volume of cloud services and data centers, architectures with multiple job dispatchers are quickly becoming the norm. Load balancing is a key element of such systems. Nevertheless, current solutions to load balancing in such systems admit a paradoxical behavior in which more accurate information regarding ser...
Article
Full-text available
Modular methods that transform Byzantine consensus protocols for the synchronous model into ones that are fast and communication efficient in failure-free executions are presented. Small and short protocol segments called layers are custom designed to act as a highly efficient preliminary stage that solves Consensus if no failures occur. When compo...
Preprint
Current-day data centers and high-volume cloud services employ a broad set of heterogeneous servers. In such settings, client requests typically arrive at multiple entry points, and dispatching them to servers is an urgent distributed systems problem. This paper presents an efficient solution to the load balancing problem in such systems that impro...
Preprint
Full-text available
Lower bounds and impossibility results in distributed computing are both intellectually challenging and practically important. Hundreds if not thousands of proofs appear in the literature, but surprisingly, the vast majority of them apply to deterministic algorithms only. Probabilistic distributed problems have been around for at least four decades...
Preprint
With the rapid increase in the size and volume of cloud services and data centers, architectures with multiple job dispatchers are quickly becoming the norm. Load balancing is a key element of such systems. Nevertheless, current solutions to load balancing in such systems admit a paradoxical behavior in which more accurate information regarding ser...
Preprint
Whereas deterministic protocols are typically guaranteed to obtain particular goals of interest, probabilistic protocols typically provide only probabilistic guarantees. This paper initiates an investigation of the interdependence between actions and subjective beliefs of agents in a probabilistic setting. In particular, we study what probabilistic...
Article
The cost of communication is a substantial factor affecting the scalability of many distributed applications. Every message sent can incur a cost in storage, computation, energy, and bandwidth. Consequently, reducing the communication costs of distributed applications is highly desirable. The best way to reduce message costs is by communicating wit...
Preprint
Modular methods to transform Byzantine consensus protocols into ones that are fast and communication efficient in the common cases are presented. Small and short protocol segments called layers are custom designed to optimize performance in the common case. When composed with a Byzantine consensus protocol of choice, they allow considerable control...
Preprint
The work described in this paper explores the use of time and synchronized clocks in centrally-managed and Software Defined Networks (SDNs). One of the main goals of this work is to analyze use cases in which explicit use of time is beneficial. Both theoretical and practical aspects of timed coordination and synchronized clocks in centralized envir...
Conference Paper
The cost of communication is a substantial factor affecting the scalability of many distributed applications. Every message sent can incur a cost in storage, computation, energy and bandwidth. Consequently, reducing the communication costs of distributed applications is highly desirable. The best way to reduce message costs is by communicating with...
Preprint
The cost of communication is a substantial factor affecting the scalability of many distributed applications. Every message sent can incur a cost in storage, computation, energy and bandwidth. Consequently, reducing the communication costs of distributed applications is highly desirable. The best way to reduce message costs is by communicating with...
Conference Paper
Even in the absence of clocks, time bounds on the duration of actions enable the use of time for distributed coordination. This paper initiates an investigation of coordination in such a setting. A new communication structure called a zigzag pattern is introduced, and is shown to guarantee bounds on the relative timing of events in this clockless m...
Article
Even in the absence of clocks, time bounds on the duration of actions enable the use of time for distributed coordination. This paper initiates an investigation of coordination in such a setting. A new communication structure called a zigzag pattern is introduced, and shown to guarantee bounds on the relative timing of events in this clockless mode...
Article
Full-text available
Characterizations of Nash equilibrium, correlated equilibrium, and rationalizability in terms of common knowledge of rationality are well known. Analogous characterizations of sequential equilibrium, (trembling hand) perfect equilibrium, and quasi-perfect equilibrium are obtained using results of Halpern [2009].
Conference Paper
Full-text available
Consensus is the most basic agreement problem encountered in fault-tolerant distributed computing: each process proposes a value and non-faulty processes must agree on the same value, which has to be one of the proposed values. While this problem is impossible to solve in asynchronous systems prone to process crash failures, it can be solved in syn...
Article
Network configuration and policy updates occur frequently, and must be performed in a way that minimizes transient effects caused by intermediate states of the network. It has been shown that accurate time can be used for coordinating network-wide updates, thereby reducing temporary inconsistencies. However, this approach presents a great challenge...
Article
We introduce ReversePTP, a clock synchronization scheme for software-defined networks (SDNs). ReversePTP is based on the Precision Time Protocol (PTP), but is conceptually reversed; in ReversePTP, all nodes (switches) in the network distribute timing information to a single software-based central node (the SDN controller), which tracks the state of...
Conference Paper
Full-text available
The set consensus problem has played an important role in the study of distributed systems for over two decades. Indeed, the search for lower bounds and impossibility results for this problem spawned the topological approach to distributed computing, which has given rise to new techniques in the design and analysis of protocols. The design of effic...
Conference Paper
This paper introduces OneClock, a generic approach for using time in networked applications. OneClock provides two basic time-triggered primitives: the ability to schedule an operation at a remote host or device, and the ability to receive feedback about the time at which an event occurred or an operation was executed at a remote host or device. We...
Conference Paper
With the rise of Software Defined Networks (SDN), there is growing interest in dynamic and centralized traffic engineering, where decisions about forwarding paths are taken dynamically from a network-wide perspective. Frequent path reconfiguration can significantly improve the network performance, but should be handled with care, so as to minimize...
Conference Paper
This paper presents the case for Data Plane Timestamping (DPT). We argue that in the unique environment of Software-Defined Networks (SDN), attaching a timestamp to the header of all packets is a powerful feature that can be leveraged by various diverse SDN applications. We analyze three key use cases that demonstrate the advantages of using DPT, a...
Conference Paper
Various applications require the use of accurate time. However, public cloud providers do not currently offer accurate time as a cloud service. We study the level of precision that can be achieved in current public cloud environments, and make the case for time-as-a-service, which can considerably improve the precision of timekeeping in the cloud.
Conference Paper
We study the behavior of end-to-end packet delay in public cloud networks. This work presents preliminary results from large-scale measurements in two public cloud networks, Amazon’s AWS and Microsoft’s Azure. We analyze the measurements both in the time domain and in the frequency domain, and present new insights into the behavior of network delay...
Article
Network updates such as policy and routing changes occur frequently in Software Defined Networks (SDN). Updates should be performed consistently, preventing temporary disruptions, and should require as little overhead as possible. Scalability is increasingly becoming an essential requirement in SDN. In this paper we propose to use time-triggered ne...
Article
Full-text available
This document defines a capability-based extension to the Network Configuration Protocol (NETCONF) that allows time-triggered configuration and management operations. This extension allows NETCONF clients to invoke configuration updates according to scheduled times and allows NETCONF servers to attach timestamps to the data they send to NETCONF cli...
Technical Report
Full-text available
This paper presents the case for Data Plane Timestamping (DPT). We argue that in the unique environment of Software-Defined Networks (SDN), attaching a timestamp to the header of all packets is a powerful feature that can be leveraged by various diverse SDN applications. We analyze three key use cases that demonstrate the advantages of using DPT, a...
Conference Paper
Full-text available
Network updates such as policy and routing changes occur frequently in Software Defined Networks (SDN). Updates should be performed consistently, preventing temporary disruptions, and should require as little overhead as possible. Scalability is increasingly becoming an essential requirement in SDN. In this paper we propose to use time-triggered ne...
Technical Report
Full-text available
In recent years, there has been growing interest in dynamic and centralized traffic engineering, where decisions about forwarding paths are taken from a network-wide perspective, based on the dynamic state of the network. Frequent path reconfiguration can significantly improve the network performance, but should be handled with care, so as to minim...
Conference Paper
Full-text available
Accurate time can be a useful tool in Software Defined Networks (SDN), allowing to coordinate network updates and topology changes, and to timestamp events and notifications. Moreover, accurate time is used in various environments in which software defined networking is being considered, making accurate time distribution an essential feature of SDN...
Conference Paper
Full-text available
We introduce ReversePTP, a novel approach to clock synchronization in Software Defined Networks (SDN). ReversePTP is based on the Precision Time Protocol (PTP), but is conceptually reversed; in ReversePTP all nodes (switches) in the network distribute timing information to a single node, the controller, that tracks the state of all the clocks in th...
Article
The coordination of a sequence of actions, to be performed in a linear temporal order in a distributed system, is studied. While in asynchronous message-passing systems such ordering of events requires the construction of message chains based on Lamport's happened-before relation, this is no longer true in the presence of time bounds on message del...
Conference Paper
Full-text available
The usage of accurate time to schedule updates in software defined networks was recently proposed in [1]; time can be a powerful tool for applying network updates in a relatively simple manner and with a very brief period of inconsistency during the update. In the current paper we introduce the flow-swapping scenario, which demonstrates the necessi...
Article
We show how game-theoretic solution concepts such as Nash equilibrium, correlated equilibrium, rationalizability, and sequential equilibrium can be given a uniform definition in terms of a knowledge-based program with counterfactual semantics. In a precise sense, this program can be viewed as providing a procedural characterization of rationality.
Article
This paper studies the interaction between knowledge, time and coordination in systems in which timing information is available. Necessary conditions are given for the causal structure in coordination problems consisting of orchestrating a set of actions in a manner that satisfies a variety of temporal ordering assumptions. Results are obtained in...
Conference Paper
Full-text available
Network configuration updates are a routine necessity, and must be performed in a way that minimizes transient effects caused by intermediate states of the network. This challenge is especially critical in Software Defined Networks, where the control plane is managed by a logically centralized controller, and configuration updates occur frequent...
Technical Report
Full-text available
Software Defined Networking (SDN) defines a network architecture in which the control plane is managed by a logically centralized controller, and thus configuration updates occur frequently. We have recently introduced an approach that uses time-based configuration updates, allowing to simplify complex update procedures and to minimize transient ef...
Conference Paper
A minor change to the standard epistemic logical language, replacing $K_i$ with $K_{\langle i,t\rangle}$ where $t$ is an explicit time instance, gives rise to a generalized and more expressive form of knowledge and common knowledge operators. We investigate the communication structures that are necessary for such generalized epistemic states to arise, and the inter-age...
Conference Paper
We study several variants of coordinated consensus in dynamic networks. We assume a synchronous model, where the communication graph for each round is chosen by a worst-case adversary. The network topology is always connected, but can change completely from one round to the next. The model captures mobile and wireless networks, where communication...
Conference Paper
Full-text available
Decision tasks require that nonfaulty processes make decisions based on their input values. Simultaneous decision tasks require that nonfaulty processes decide in the same round. Most decision tasks have known worst-case lower bounds. Most also have known worst-case optimal protocols that halt in the number of rounds given by the worst-case lower b...
Conference Paper
This paper studies the role that known bounds on message transmission times in a computer network play on the evolution of the epistemic state over time. A connection to cones of causal influence analogous to, and more general than, light cones is presented. Focusing on lower bounds on message transmission times, an analysis is presented of how kno...
Article
The effect of upper bounds on message delivery times in a computer network upon the dynamics of knowledge gain is investigated. Recent work has identified centipedes and brooms—causal structures that combine message chains with time bound information—as necessary conditions for knowledge gain and common knowledge gain, respectively. This paper show...
Article
Lamport's happened-before relation provides a starting point for an enquiry into causal relations in synchronous systems. We define the ordered response problem, a natural coordination task. By analyzing solutions to this task we arrive at the Centipede Theorem, that gives a concise characterization of synchronous causality.
Article
Reconfiguration means changing the set of processes executing a distributed system. We explain several methods for reconfiguring a system implemented using the state-machine approach, including some new ones. We discuss the relation between these methods ...
Conference Paper
The general omissions failure model, in which a faulty process may omit both to send and to receive messages, is inherently more complex than the more popular sending omissions model. This fact is exemplified in tasks involving simultaneous decisions, such as the simultaneous consensus (SC) problem. While efficient polynomial protocols for SC that a...
Conference Paper
Full-text available
Consider a fully connected network where up to $t$ processes may crash, and all processes start in an arbitrary memory state. The self-stabilizing firing squad problem consists of eventually guaranteeing simultaneous response to an external input. This is modeled by requiring that the non-crashed processes "fire" simultaneously if some correct proc...
Conference Paper
Full-text available
A continuous consensus (CC) protocol maintains for each process i at each time k an up-to-date core M_i[k] of information about the past, so that the cores at all processes are guaranteed to be identical. This is a generalization of simultaneous consensus that provides processes with the ability to perform simultaneously coordinated actions, and sa...
Article
Fekete and Lynch have proved that reliable end-to-end communication is impossible for lossy FIFO channels without messages containing header information. The results of Wang and Zuck show that, in non-FIFO models with duplication or loss, reliable end-to-end communication is impossible unless the number of different packet types is greater than the...
Article
Full-text available
Bacharach and Cave independently generalized Aumann's celebrated agreement theorem to the case of decision functions. Roughly speaking, they showed that once two like-minded agents reach common knowledge of the actions each of them intends to perform, they will perform identical actions. This theorem is proved for decision functions that satisfy a...
Article
Full-text available
This paper introduces the continuous consensus problem, in which a core M[k] of information is continuously maintained at all correct sites of the system. All local copies of the core must be identical at all times k, and every interesting event should eventually enter the core. The continuous consensus problem is studied in synchronous systems wit...
Conference Paper
Full-text available
Continuous consensus (CC) is the problem of maintaining an identical and up-to-date core of information about the past at all correct processes in the system [1]. This is a primitive that supports simultaneous coordination among processes, and eliminates the need of issuing separate instances of consensus for different tasks. Recent work has presen...
Article
Continuous consensus (CC) is the problem of maintaining up-to-date and identical copies of a “core” of information about the past at all correct processes in the system (Mizrahi and Moses, 2008 [6]). This is a primitive that supports simultaneous coordination among processes, and eliminates the need for issuing separate instances of consensus for d...
Conference Paper
Full-text available
Fault-tolerant systems often require a means by which independent processes or processors can arrive at an exact mutual agreement of some kind. The work announced in this note studies the continuous consensus problem, which is a general tool for enabling actions that are performed at the same time at different sites of the system to be consistent w...
Article
A probabilistic algorithm is presented for finding correspondences across multiple images in systems with large numbers of cameras and considerable overlap. The algorithm employs the theory of random graphs to provide an efficient probabilistic algorithm that performs Wide-baseline Stereo (WBS) comparisons on a small number of image pairs, and then...
Article
A rigorous framework for analyzing safe composition of distributed programs is presented. It facilitates specifying notions of safe sequential execution of distributed programs in various models of communication. A notion of sealing is defined, where if a program P is immediately followed by a program Q that seals P then P will be —it will execute...
Article
We show how solution concepts in games such as Nash equilibrium, correlated equilibrium, rationalizability, and sequential equilibrium can be given a uniform definition in terms of knowledge-based programs. Intuitively, all solution concepts are implementations of two knowledge-based programs, one appropriate for games represented in normal...
Conference Paper
This paper provides a proof of correctness for the celebrated Minimum Spanning Tree protocol of Gallager, Humblet and Spira [GHS83]. Both the protocol and the quest for a natural correctness proof have had considerable impact on the literature concerning network protocols and verification. We present an invariance proof that is based on a new inter...
Conference Paper
The fundamental question considered in this paper is when program Q, if executed immediately after program P, is guaranteed not to interfere with P and be safe from interference by P. If a message sent by one of these programs is received by the other, it may affect and modify the other’s execution. The notion of communication closed layers (CCLs)...
Conference Paper
Ideal communication channels in asynchronous systems are reliable, deliver messages in FIFO order, and do not deliver spurious or duplicate messages. A message vocabulary of size two (i.e., single-bit messages) suffices to encode and transmit messages of arbitrary finite length over such channels. This note proves that single-bit messages are insuf...
Conference Paper
We present a probabilistic algorithm for finding correspondences across multiple images. The algorithm runs in a distributed setting, where each camera is attached to a separate computing unit, and the cameras communicate over a network. No central computer is involved in the computation. The algorithm runs with low computational and communic...
Article
This paper adds counterfactuals to the framework of knowledge-based programs of Fagin, Halpern, Moses, and Vardi. The use of counterfactuals is illustrated by designing a protocol in which an agent stops sending messages once it knows that it is safe to do so. Such behavior is difficult to capture in the original framework because it involves reaso...
Book
Full-text available
Reasoning about knowledge—particularly the knowledge of agents who reason about the world and each other's knowledge—was once the exclusive province of philosophers and puzzle solvers. More recently, this type of reasoning has been shown to play a key role in a surprising number of contexts, from understanding conversations to the analysis of distr...
Article
Full-text available
This paper introduces a simple notion of layering as a tool for analyzing well-behaved runs of a given model of distributed computation. Using layering, a model-independent analysis of the consensus problem is performed and then applied to proving lower bounds and impossibility results for consensus in a number of familiar and less familiar models. T...
Article
This paper develops a highly expressive semantic framework for program refinement that supports both temporal reasoning and reasoning about the knowledge of a single agent. The framework generalizes a previously developed temporal refinement framework by amalgamating it with a logic of quantified local propositions, a generalization of the logic of...
Conference Paper
An expressive semantic framework for program refinement that supports both temporal reasoning and reasoning about the knowledge of multiple agents is developed. The refinement calculus owes the cleanliness of its decomposition rules for all programming language constructs and the relative simplicity of its semantic model to a rigid synchrony assump...
Conference Paper
An expressive semantic framework for program refinement that supports both temporal reasoning and reasoning about the knowledge of multiple agents is developed. The refinement calculus owes the cleanliness of its decomposition rules for all programming language constructs and the relative simplicity of its semantic model to a rigid synchrony assump...
Article
While the intuition underlying a zero knowledge proof system [GMR85] is that no "knowledge" is leaked by the prover to the verifier, researchers are just beginning to analyze such proof systems in terms of formal notions of knowledge. In this paper, we show how interactive proof systems motivate a new notion of practical knowledge, and we captur...
Conference Paper
Developing correct computer programs is a notoriously difficult task, which has attracted a significant intellectual effort over the past decades. One attractive methodology that has been proposed to tackle this problem consists of systems for program refinement, in which a calculus is given for transforming, often in a top-down manner, the specifi...
Conference Paper
This paper develops a highly expressive semantic framework for program refinement that supports both temporal reasoning and reasoning about the knowledge of a single agent. The framework generalizes a previously developed temporal refinement framework by amalgamating it with a logic of quantified local propositions, a generalization of the logic of...
Article
Full-text available
This paper presents a polynomial-time protocol for reaching Byzantine agreement in t + 1 rounds whenever n > 3t, where n is the number of processors and t is an a priori upper bound on the number of failures. This resolves an open problem presented by Pease, Shostak and Lamport in 1980. An early-stopping variant of this protocol is also presented...
Article
We show how counterfactuals can be added to the framework of knowledge-based programs of Fagin, Halpern, Moses, and Vardi [1995, 1997]. We show that counterfactuals allow us to capture in a natural way notions like minimizing the number of messages that are sent, whereas attempts to formalize these notions without counterfactuals lead to some rath...
Article
Reasoning about knowledge seems to play a fundamental role in distributed systems. Indeed, such reasoning is a central part of the informal intuitive arguments used in the design of distributed protocols. Communication in a distributed system can be viewed as the act of transforming the system's state of knowledge. This paper presents a general f...
Article
Reasoning about knowledge seems to play a fundamental role in distributed systems. Indeed, such reasoning is a central part of the informal intuitive arguments used in the design of distributed protocols. Communication in a distributed system can be viewed as the act of transforming the system's state of knowledge. This paper presents a general f...
Article
We investigate eventual Byzantine agreement (EBA) in the crash and omission failure modes. The emphasis is on characterizing optimal EBA protocols in terms of the states of knowledge required by the processors in order to attain EBA. It is well known that common knowledge among the nonfaulty processors is a necessary and sufficient condition for...
Article
Full-text available
The standard approach in AI to knowledge representation is to represent an agent's knowledge symbolically as a collection of formulas, which we can view as a knowledge base. An agent is then said to know a fact if it is provable from the formulas in his knowledge base. Halpern and Vardi advocated a model-theoretic approach to knowledge representati...
Conference Paper
This paper introduces the semantics of a wide spectrum language with a rich compositional structure that is able to represent both temporal specifications and sequential programs. A key feature of the language is the ability to represent partial correctness annotations expressed in temporal logic. A refinement relation is presented that enables r...
Article
We consider the common-knowledge paradox raised by Halpern and Moses: common knowledge is necessary for agreement and coordination, but common knowledge is unattainable in the real world because of temporal imprecision. We discuss two solutions to this paradox: (1) modeling the world with a coarser granularity, and (2) relaxing the requirements...
Article
An agent's limited view of the state of a distributed system may render globally different situations indistinguishable. A proposition is local for this agent whenever his view suffices to decide this proposition. Motivated by a framework for the development of distributed programs from knowledge-based specifications, we introduce a modal logic of...
Conference Paper
Full-text available
This paper adds counterfactuals to the framework of knowledge-based programs of Fagin, Halpern, Moses, and Vardi [3,4]. The use of counterfactuals is illustrated by designing a protocol in which an agent stops sending messages once it knows that it is safe to do so. Such behavior is difficult to capture in the original framework because it involves...
Article
Full-text available
We introduce a simple notion of layering that provides a tool for defining submodels of a given model of distributed computation. We describe two layerings, the synchronic and the permutation layering, and show that they induce appropriate submodels of several asynchronous models of computation. The synchronic layering applies to the synchronous mo...
Conference Paper
An agent's limited view of the state of a distributed system may render globally different situations indistinguishable. A proposition is local for this agent whenever his view suffices to decide this proposition. Motivated by a framework for the development of distributed programs from knowledge-based specifications, we introduce a modal logic of...
Article
Full-text available
Reasoning about activities in a distributed computer system at the level of the knowledge of individuals and groups allows us to abstract away from many concrete details of the system we are considering. In this paper, we make use of two notions introduced in our recent book to facilitate designing and reasoning about systems in terms of knowledge...
Article
Full-text available
This paper is an attempt to resolve the paradox of common knowledge: Although common knowledge can be shown to be a prerequisite for day-to-day activities of coordination and agreement, it can also be shown to be unattainable in practice. The resolution of this paradox leads to a deeper understanding of the nature of common knowledge and coordinati...