Rajeev Alur

University of Pennsylvania, Philadelphia, Pennsylvania, United States


Publications (287)

  • ABSTRACT: Given a specification and a set of candidate programs (program space), the program synthesis problem is to find a candidate program that satisfies the specification. We present the synthesis through unification (STUN) approach, which is an extension of the counter-example guided inductive synthesis (CEGIS) approach. In CEGIS, the synthesizer maintains a subset S of inputs and a candidate program Prog that is correct for S. The synthesizer repeatedly checks if there exists a counter-example input c such that the execution of Prog is incorrect on c. If so, the synthesizer enlarges S to include c, and picks a program from the program space that is correct for the new set S. The STUN approach extends CEGIS with the idea that given a program Prog that is correct for a subset of inputs, the synthesizer can try to find a program Prog' that is correct for the rest of the inputs. If Prog and Prog' can be unified into a program in the program space, then a solution has been found. We present a generic synthesis procedure based on the STUN approach and specialize it for three different domains by providing the appropriate unification operators. We implemented these specializations in prototype tools, and we show that our tools often perform significantly better on standard benchmarks than a tool based on a pure CEGIS approach.
    (A toy Python sketch of the CEGIS loop appears in the code sketches after this publication list.)
  • ABSTRACT: A distributed protocol is typically modeled as a set of communicating processes, where each process is described as an extended state machine along with fairness assumptions, and its correctness is specified using safety and liveness requirements. Designing correct distributed protocols is a challenging task. Aimed at simplifying this task, we allow the designer to leave some of the guards and updates to state variables in the description of extended state machines as unknown functions. The protocol completion problem then is to find interpretations for these unknown functions while guaranteeing correctness. In many distributed protocols, process behaviors are naturally symmetric, and thus, synthesized expressions are further required to obey symmetry constraints. Our counterexample-guided synthesis algorithm consists of repeatedly invoking two phases. In the first phase, candidates for unknown expressions are generated using the SMT solver Z3. This phase requires carefully orchestrating constraints to enforce the desired symmetry in read/write accesses. In the second phase, the resulting completed protocol is checked for correctness using a custom-built model checker that handles fairness assumptions, safety and liveness requirements, and exploits symmetry. When model checking fails, our tool examines a set of counterexamples to safety/liveness properties to generate constraints on unknown functions that must be satisfied by subsequent completions. For evaluation, we show that our prototype is able to automatically discover interesting missing details in distributed protocols for mutual exclusion, self-stabilization, and cache coherence.
  • ABSTRACT: In computer-aided education, the goal of automatic feedback is to provide a meaningful explanation of students' mistakes. We focus on providing feedback for constructing a deterministic finite automaton that accepts strings that match a described pattern. Natural choices for feedback are binary feedback (correct/wrong) and a counterexample of a string that is processed incorrectly. Such feedback is easy to compute but might not provide the student enough help. Our first contribution is a novel way to automatically compute alternative conceptual hints. Our second contribution is a rigorous evaluation of feedback with 377 students. We find that providing either counterexamples or hints is judged as helpful, increases student perseverance, and can improve problem completion time. However, both strategies have particular strengths and weaknesses. Since our feedback is completely automatic, it can be deployed at scale and integrated into existing massive open online courses.
    ACM Transactions on Computer-Human Interaction 03/2015; 22(2). DOI:10.1145/2723163 · 0.57 Impact Factor
  • Article: DReX
    ACM SIGPLAN Notices 01/2015; 50(1):125-137. DOI:10.1145/2775051.2676981 · 0.62 Impact Factor
  • ABSTRACT: We present DReX, a declarative language that can express all regular string-to-string transformations, and can still be efficiently evaluated. The class of regular string transformations has a robust theoretical foundation including multiple characterizations, closure properties, and decidable analysis questions, and admits a number of string operations such as insertion, deletion, substring swap, and reversal. Recent research has led to a characterization of regular string transformations using a primitive set of function combinators analogous to the definition of regular languages using regular expressions. While these combinators form the basis for the language DReX proposed in this paper, our main technical focus is on the complexity of evaluating the output of a DReX program on a given input string. It turns out that the natural evaluation algorithm involves dynamic programming, leading to complexity that is cubic in the length of the input string. Our main contribution is identifying a consistency restriction on the use of combinators in DReX programs, and a single-pass evaluation algorithm for consistent programs with time complexity that is linear in the length of the input string and polynomial in the size of the program. We show that the consistency restriction does not limit the expressiveness, and whether a DReX program is consistent can be checked efficiently. We report on a prototype implementation, and evaluate it using a representative set of text processing tasks.
    (A simplified combinator evaluator is sketched in the code sketches after this publication list.)
    POPL15; 01/2015
  • Loris D'Antoni, Rajeev Alur
    ABSTRACT: Nested words model data with both linear and hierarchical structure such as XML documents and program traces. A nested word is a sequence of positions together with a matching relation that connects open tags (calls) with the corresponding close tags (returns). Visibly Pushdown Automata are a restricted class of pushdown automata that process nested words, and have many appealing theoretical properties such as closure under Boolean operations and decidable equivalence. However, like classical automata models, they are limited to finite alphabets. This limitation is restrictive for practical applications to both XML processing and program trace analysis, where values for individual symbols are usually drawn from an unbounded domain. With this motivation, we introduce Symbolic Visibly Pushdown Automata (SVPA) as an executable model for nested words over infinite alphabets. In this model, transitions are labeled with first-order predicates over the input alphabet, analogous to symbolic automata processing strings over infinite alphabets. A key novelty of SVPAs is the use of binary predicates to model relations between open and close tags in a nested word. We show how SVPAs still enjoy the decidability and closure properties of Visibly Pushdown Automata. We use SVPAs to model XML validation policies and program properties that are not naturally expressible with previous formalisms and provide experimental results for our implementation.
    (A toy check based on a binary open/close-tag predicate appears in the code sketches after this publication list.)
    CAV14; 07/2014
  • ABSTRACT: Scenarios, or Message Sequence Charts, offer an intuitive way of describing the desired behaviors of a distributed protocol. In this paper we propose a new way of specifying finite-state protocols using scenarios: we show that it is possible to automatically derive a distributed implementation from a set of scenarios augmented with a set of safety and liveness requirements, provided the given scenarios adequately cover all the states of the desired implementation. We first derive incomplete state machines from the given scenarios, and then synthesis corresponds to completing the transition relation of individual processes so that the global product meets the specified requirements. This completion problem, in general, has the same complexity, PSPACE, as the verification problem, but unlike the verification problem, is NP-complete for a constant number of processes. We present two algorithms for solving the completion problem, one based on a heuristic search in the space of possible completions and one based on OBDD-based symbolic fixpoint computation. We evaluate the proposed methodology for protocol specification and the effectiveness of the synthesis algorithms using the classical alternating-bit protocol.
  • ABSTRACT: We focus on (partial) functions that map input strings to a monoid such as the set of integers with addition and the set of output strings with concatenation. The notion of regularity for such functions has been defined using two-way finite-state transducers, (one-way) cost register automata, and MSO-definable graph transformations. In this paper, we give an algebraic and machine-independent characterization of this class analogous to the definition of regular languages by regular expressions. When the monoid is commutative, we prove that every regular function can be constructed from constant functions using the combinators of choice, split sum, and iterated sum, that are analogs of union, concatenation, and Kleene-*, respectively, but enforce unique (or unambiguous) parsing. Our main result is for the general case of non-commutative monoids, which is of particular interest for capturing regular string-to-string transformations for document processing. We prove that the following additional combinators suffice for constructing all regular functions: (1) the left-additive versions of split sum and iterated sum, which allow transformations such as string reversal; (2) sum of functions, which allows transformations such as copying of strings; and (3) function composition, or alternatively, a new concept of chained sum, which allows output values from adjacent blocks to mix.
  • ABSTRACT: The design and implementation of software for medical devices is challenging due to the closed-loop interaction with the patient, which is a stochastic physical environment. The safety-critical nature and the lack of existing industry standards for verification make this an ideal domain for exploring applications of formal modeling and closed-loop analysis. The biggest challenge is that the environment model(s) have to be both complex enough to express the physiological requirements and general enough to cover all possible inputs to the device. In this effort, we use a dual chamber implantable pacemaker as a case study to demonstrate verification of software specifications of medical devices as timed-automata models in UPPAAL. The pacemaker model is based on the specifications and algorithm descriptions from Boston Scientific. The heart is modeled using timed automata based on the physiology of the heart. The model is gradually abstracted with timed simulation to preserve properties. A manual Counter-Example-Guided Abstraction and Refinement (CEGAR) framework has been adapted to refine the heart model when spurious counter-examples are found. To demonstrate the closed-loop nature of the problem and heart model refinement, we investigated two clinical cases of Pacemaker Mediated Tachycardia and verified their corresponding correction algorithms in the pacemaker. Along with our tools for code generation from UPPAAL models, this effort enables model-driven design and certification of software for medical devices.
    International Journal on Software Tools for Technology Transfer 01/2014; 16(2). DOI:10.1007/s10009-013-0289-7
  • Rajeev Alur, Salar Moarref, Ufuk Topcu
    ABSTRACT: The reactive synthesis problem is to find a finite-state controller that satisfies a given temporal-logic specification regardless of how its environment behaves. Developing a formal specification is a challenging and tedious task and initial specifications are often unrealizable. In many cases, the source of unrealizability is the lack of adequate assumptions on the environment of the system. In this paper, we consider the problem of automatically correcting an unrealizable specification given in the generalized reactivity (1) fragment of linear temporal logic by adding assumptions on the environment. When a temporal-logic specification is unrealizable, the synthesis algorithm computes a counter-strategy as a witness. Our algorithm then analyzes this counter-strategy and synthesizes a set of candidate environment assumptions that can be used to remove the counter-strategy from the environment's possible behaviors. We demonstrate the applicability of our approach with several case studies.
  • ABSTRACT: One challenge in making online education more effective is to develop automatic grading software that can provide meaningful feedback. This paper provides a solution to automatic grading of the standard computation-theory problem that asks a student to construct a deterministic finite automaton (DFA) from the given description of its language. We focus on how to assign partial grades for incorrect answers. Each student's answer is compared to the correct DFA using a hybrid of three techniques devised to capture different classes of errors. First, in an attempt to catch syntactic mistakes, we compute the edit distance between the two DFA descriptions. Second, we consider the entropy of the symmetric difference of the languages of the two DFAs, and compute a score that estimates the fraction of the number of strings on which the student answer is wrong. Our third technique is aimed at capturing mistakes in reading of the problem description. For this purpose, we consider a description language MOSEL, which adds syntactic sugar to the classical Monadic Second Order Logic, and allows defining regular languages in a concise and natural way. We provide algorithms, along with optimizations, for transforming MOSEL descriptions into DFAs and vice-versa. These allow us to compute the syntactic edit distance of the incorrect answer from the correct one in terms of their logical representations. We report an experimental study that evaluates hundreds of answers submitted by (real) students by comparing grades/feedback computed by our tool with human graders. Our conclusion is that the tool is able to assign partial grades in a meaningful way, and should be preferred over the human graders for both scalability and consistency.
    (A shortest-counterexample search over the product of two DFAs is sketched in the code sketches after this publication list.)
    Proceedings of the Twenty-Third international joint conference on Artificial Intelligence; 08/2013
  • ABSTRACT: We propose a deterministic model for associating costs with strings that is parameterized by operations of interest (such as addition, scaling, and minimum), a notion of regularity that provides a yardstick to measure expressiveness, and study decision problems and theoretical properties of resulting classes of cost functions. Our definition of regularity relies on the theory of string-to-tree transducers, and allows associating costs with events that are conditioned on regular properties of future events. Our model of cost register automata allows computation of regular functions using multiple "write-only" registers whose values can be combined using the allowed set of operations. We show that the classical shortest-path algorithms as well as the algorithms designed for computing discounted costs can be adapted for solving the min-cost problems for the more general classes of functions specified in our model. Cost register automata with the operations of minimum and increment give a deterministic model that is equivalent to weighted automata, an extensively studied nondeterministic model, and this connection results in new insights and new open problems.
    (A toy cost register automaton is sketched in the code sketches after this publication list.)
    Proceedings of the 2013 28th Annual ACM/IEEE Symposium on Logic in Computer Science; 06/2013
  • ABSTRACT: With the maturing of technology for model checking and constraint solving, there is an emerging opportunity to develop programming tools that can transform the way systems are specified. In this paper, we propose a new way to program distributed protocols using concolic snippets. Concolic snippets are sample execution fragments that contain both concrete and symbolic values. The proposed approach allows the programmer to describe the desired system partially using the traditional model of communicating extended finite-state-machines (EFSM), along with high-level invariants and concrete execution fragments. Our synthesis engine completes an EFSM skeleton by inferring guards and updates from the given fragments which is then automatically analyzed using a model checker with respect to the desired invariants. The counterexamples produced by the model checker can then be used by the programmer to add new concrete execution fragments that describe the correct behavior in the specific scenario corresponding to the counterexample. We describe TRANSIT, a language and prototype implementation of the proposed specification methodology for distributed protocols. Experimental evaluations of TRANSIT to specify cache coherence protocols show that (1) the algorithm for expression inference from concolic snippets can synthesize expressions of size 15 involving typical operators over commonly occurring types, (2) for a classical directory-based protocol, TRANSIT automatically generates, in a few seconds, a complete implementation from a specification consisting of the EFSM structure and a few concrete examples for every transition, and (3) a published partial description of the SGI Origin cache coherence protocol maps directly to symbolic examples and leads to a complete implementation in a few iterations, with the programmer correcting counterexamples resulting from underspecified transitions by adding concrete examples in each iteration.
    Proceedings of the 34th ACM SIGPLAN conference on Programming language design and implementation; 06/2013
  • Rajeev Alur, Mukund Raghothaman
    ABSTRACT: Additive Cost Register Automata (ACRA) map strings to integers using a finite set of registers that are updated using assignments of the form "x := y + c" at every step. The corresponding class of additive regular functions has multiple equivalent characterizations, appealing closure properties, and a decidable equivalence problem. In this paper, we solve two decision problems for this model. First, we define the register complexity of an additive regular function to be the minimum number of registers that an ACRA needs to compute it. We characterize the register complexity by a necessary and sufficient condition regarding the largest subset of registers whose values can be made far apart from one another. We then use this condition to design a PSPACE algorithm to compute the register complexity of a given ACRA, and establish a matching lower bound. Our results also lead to a machine-independent characterization of the register complexity of additive regular functions. Second, we consider two-player games over ACRAs, where the objective of one of the players is to reach a target set while minimizing the cost. We show the corresponding decision problem to be EXPTIME-complete when costs are non-negative integers, but undecidable when costs are integers.
  • ABSTRACT: We consider a resource allocation problem that ensures a fair QoS (Quality of Service) level among selfish clients in a cloud computing system. The clients share multiple resources and process applications concurrently on the cloud computing ...
    Proceedings of the 2nd ACM international conference on High confidence networked systems; 04/2013
  • ABSTRACT: Bounded-rate multi-mode systems (BMMS) are hybrid systems that can switch freely among a finite set of modes, and whose dynamics is specified by a finite number of real-valued variables with mode-dependent rates that can vary within given bounded sets. The schedulability problem for BMMS is defined as an infinite-round game between two players---the scheduler and the environment---where in each round the scheduler proposes a time and a mode while the environment chooses an allowable rate for that mode, and the state of the system changes linearly in the direction of the rate vector. The goal of the scheduler is to keep the state of the system within a pre-specified safe set using a non-Zeno schedule, while the goal of the environment is the opposite. Green scheduling under uncertainty is a paradigmatic example of BMMS where a winning strategy of the scheduler corresponds to a robust energy-optimal policy. We present an algorithm to decide whether the scheduler has a winning strategy from an arbitrary starting state, and give an algorithm to compute such a winning strategy, if it exists. We show that the schedulability problem for BMMS is co-NP-complete in general, but for two variables it is in PTIME. We also study the discrete schedulability problem where the environment has only finitely many choices of rate vectors in each mode and the scheduler can make decisions only at multiples of a given clock period, and show it to be EXPTIME-complete.
  • ABSTRACT: Courcelle (1992) proposed the idea of using logic, in particular Monadic second-order logic (MSO), to define graph to graph transformations. Transducers, on the other hand, are executable machine models to define transformations, and are typically studied in the context of string-to-string transformations. Engelfriet and Hoogeboom (2001) studied two-way finite state string-to-string transducers and showed that their expressiveness matches MSO-definable transformations (MSOT). Alur and Cerny (2011) presented streaming transducers, one-way transducers equipped with multiple registers that can store output strings, as an equi-expressive model. Natural generalizations of streaming transducers to string-to-tree (Alur and D'Antoni, 2012) and infinite-string-to-string (Alur, Filiot, and Trivedi, 2012) cases preserve MSO-expressiveness. While earlier reductions from MSOT to streaming transducers used two-way transducers as the intermediate model, we revisit the earlier reductions in a more general, and previously unexplored, setting of infinite-string-to-tree transformations, and provide a direct reduction. Proof techniques used for this new reduction exploit the conceptual tools (composition theorem and finite additive coloring theorem) presented by Shelah (1975) in his alternative proof of Büchi's theorem. Using such streaming string-to-tree transducers we show the decidability of functional equivalence for MSO-definable infinite-string-to-tree transducers.
    Logic in Computer Science (LICS), 2013 28th Annual IEEE/ACM Symposium on; 01/2013
  • Conference Paper: Syntax-guided synthesis
    ABSTRACT: The classical formulation of the program-synthesis problem is to find a program that meets a correctness specification given as a logical formula. Recent work on program synthesis and program optimization illustrates many potential benefits of allowing the user to supplement the logical specification with a syntactic template that constrains the space of allowed implementations. Our goal is to identify the core computational problem common to these proposals in a logical framework. The input to the syntax-guided synthesis problem (SyGuS) consists of a background theory, a semantic correctness specification for the desired program given by a logical formula, and a syntactic set of candidate implementations given by a grammar. The computational problem then is to find an implementation from the set of candidate expressions so that it satisfies the specification in the given theory. We describe three different instantiations of the counter-example-guided-inductive-synthesis (CEGIS) strategy for solving the synthesis problem, report on prototype implementations, and present experimental results on an initial set of benchmarks.
    Formal Methods in Computer-Aided Design (FMCAD), 2013; 01/2013
  • ABSTRACT: Mapping virtual networks to physical networks under bandwidth constraints is a key computational problem for the management of data centers. Recently proposed heuristic strategies for this problem work efficiently, but are not guaranteed to always find an allocation even when one exists. Given that the bandwidth allocation problem is NP-complete, and the state-of-the-art SAT solvers have recently been successfully applied to NP-hard problems in planning and formal verification, the goal of this paper is to study whether these SAT solvers can be used to solve the bandwidth allocation problem exactly with acceptable overhead. We investigate alternative ways of encoding the allocation problem, and develop techniques for abstraction and refinement of network graphs for scalability. We report experimental comparisons of the proposed encodings with the existing heuristics for typical data-center topologies.
    Formal Methods in Computer-Aided Design (FMCAD), 2013; 01/2013
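
Code sketches for selected abstracts

The CEGIS loop described in the STUN abstract above is easy to illustrate on a toy problem. The sketch below is a hypothetical, minimal illustration and not the authors' tool: the program space is a hard-coded list of expressions, the specification is "return the maximum of the two inputs", and verification is brute-force search over a small bounded domain; all names are invented for this example.

    # Minimal CEGIS-style loop on a toy problem: synthesize max(a, b).
    # Illustrative sketch only; not the STUN/CEGIS implementations above.
    from itertools import product

    PROGRAM_SPACE = [                      # a tiny, hard-coded program space
        ("a", lambda a, b: a),
        ("b", lambda a, b: b),
        ("a + b", lambda a, b: a + b),
        ("a if a > b else b", lambda a, b: a if a > b else b),
    ]

    def spec(a, b, out):
        """Correctness specification: the output must equal max(a, b)."""
        return out == max(a, b)

    def verify(prog, domain):
        """Return an input on which prog violates the spec, or None."""
        for a, b in product(domain, repeat=2):
            if not spec(a, b, prog(a, b)):
                return (a, b)
        return None

    def synthesize(domain=range(-3, 4)):
        examples = []                                  # the growing set S of inputs
        while True:
            candidate = next(                          # pick any program correct on S
                ((name, prog) for name, prog in PROGRAM_SPACE
                 if all(spec(a, b, prog(a, b)) for a, b in examples)),
                None)
            if candidate is None:
                return None                            # program space exhausted
            name, prog = candidate
            cex = verify(prog, domain)                 # check for a counterexample c
            if cex is None:
                return name                            # correct on the whole domain
            examples.append(cex)                       # enlarge S to include c

    print(synthesize())                                # -> "a if a > b else b"

Each candidate is picked at most once after a counterexample rules it out, so the loop terminates; the STUN extension discussed in the abstract would instead try to unify a partially correct candidate with a second program covering the remaining inputs.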
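
The DReX abstract describes combinators for regular string transformations and the cost of evaluating them when parses must be unambiguous. The toy evaluator below is only a simplification of that idea, not DReX's actual syntax or algorithm: a transformation is a partial function from strings to strings, concatenation tries every split point and succeeds only when exactly one works, and iteration is restricted to single-character pieces so that parsing is trivially unique.

    # Toy evaluator for DReX-style combinators; a transformation is a partial
    # function from strings to strings, and combinators must be unambiguous.
    # Simplified illustration only, not DReX's actual syntax or evaluator.
    from typing import Callable, Optional

    Transform = Callable[[str], Optional[str]]      # None = undefined on this input

    def atom(pred: Callable[[str], bool], out: Callable[[str], str]) -> Transform:
        """Map a single character c to out(c), provided pred(c) holds."""
        return lambda w: out(w) if len(w) == 1 and pred(w) else None

    def split(f: Transform, g: Transform) -> Transform:
        """Concatenation: defined only if exactly one split point works."""
        def run(w):
            outs = []
            for i in range(len(w) + 1):             # trying every split point is
                left, right = f(w[:i]), g(w[i:])    # what makes naive evaluation slow
                if left is not None and right is not None:
                    outs.append(left + right)
            return outs[0] if len(outs) == 1 else None
        return run

    def choice(f: Transform, g: Transform) -> Transform:
        """Union: defined only if exactly one of the two branches is defined."""
        def run(w):
            a, b = f(w), g(w)
            if (a is None) == (b is None):          # both or neither -> undefined
                return None
            return a if a is not None else b
        return run

    def iterate(f: Transform) -> Transform:
        """Iterated sum of a single-character transformation (unique parse)."""
        def run(w):
            pieces = [f(c) for c in w]
            return None if any(p is None for p in pieces) else "".join(pieces)
        return run

    # Example: keep letters unchanged and delete digits.
    keep = atom(str.isalpha, lambda c: c)
    drop = atom(str.isdigit, lambda c: "")
    strip_digits = iterate(choice(keep, drop))
    print(strip_digits("a1b2c3"))                   # -> "abc"

    # split example: a letter followed by a digit; the letter is doubled.
    pair = split(atom(str.isalpha, lambda c: c + c), atom(str.isdigit, lambda c: ""))
    print(pair("x7"))                               # -> "xx"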
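
The SVPA abstract highlights binary predicates that relate an open tag to its matching close tag. The sketch below is a simplified pushdown-style check in that spirit, not the SVPA model itself: calls are pushed on a stack, and each return is compared with the matching call using a binary predicate; the tag format and predicates are invented for the example.

    # Toy "symbolic visibly pushdown" check: calls push their symbol, returns are
    # checked against the matching call with a binary predicate. Illustrative
    # sketch only; the tag format here is made up, not the SVPA paper's syntax.

    def well_matched(tags, call_pred, return_pred, match_pred):
        """Process a sequence of tags; calls (per call_pred) push, returns
        (per return_pred) pop and must satisfy match_pred(call, return)."""
        stack = []
        for t in tags:
            if call_pred(t):
                stack.append(t)                       # unary predicate on the call
            elif return_pred(t):
                if not stack:
                    return False                      # unmatched return
                if not match_pred(stack.pop(), t):    # binary call/return predicate
                    return False
            # other symbols are treated as internal and ignored in this sketch
        return not stack                              # no pending calls

    # XML-like tags: "<x>" opens, "</x>" closes, and the names must agree.
    is_open = lambda t: t.startswith("<") and not t.startswith("</")
    is_close = lambda t: t.startswith("</")
    same_name = lambda o, c: o[1:-1] == c[2:-1]

    print(well_matched(["<a>", "<b>", "</b>", "</a>"], is_open, is_close, same_name))  # True
    print(well_matched(["<a>", "</b>"], is_open, is_close, same_name))                 # False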
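
Two of the abstracts above (on automated feedback and automated grading for DFA constructions) rely on comparing a student's DFA with the target DFA, either to produce a counterexample string or to reason about the symmetric difference of the two languages. The sketch below shows the standard product-automaton search for the first of these tasks; the dictionary-based DFA encoding and the example automata are invented for this illustration.

    # Shortest counterexample between two DFAs, found by BFS over their product.
    # Toy illustration of counterexample feedback; not the tools' actual format.
    from collections import deque

    def shortest_counterexample(dfa1, dfa2, alphabet):
        """Return a shortest string accepted by exactly one of the two DFAs,
        or None if they are equivalent. Each DFA is a triple
        (start_state, transition_dict, accepting_set) with total transitions."""
        s1, d1, acc1 = dfa1
        s2, d2, acc2 = dfa2
        start = (s1, s2)
        seen = {start}
        queue = deque([(start, "")])
        while queue:
            (q1, q2), word = queue.popleft()
            if (q1 in acc1) != (q2 in acc2):          # word is in the symmetric difference
                return word
            for a in alphabet:
                nxt = (d1[q1][a], d2[q2][a])
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, word + a))
        return None

    # Target language: strings over {a, b} with an even number of a's.
    target = ("e", {"e": {"a": "o", "b": "e"}, "o": {"a": "e", "b": "o"}}, {"e"})
    # A student answer that wrongly also requires the string to end in 'b'.
    student = ("0", {"0": {"a": "1", "b": "2"}, "1": {"a": "0", "b": "1"},
                     "2": {"a": "1", "b": "2"}}, {"2"})
    print(repr(shortest_counterexample(target, student, "ab")))
    # -> '' (the empty string is accepted by the target but not by the student DFA)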
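
The cost-register-automata abstract above describes deterministic machines whose "write-only" registers are updated with operations such as increment and are combined with minimum. A minimal example of such a function, min(#a, #b), is sketched below; it illustrates only the flavor of the model, not the paper's formal definitions or algorithms.

    # Toy cost register automaton: registers updated by increments, combined
    # with min only when producing the final output. Illustrative sketch only.

    def min_count(word: str) -> int:
        """Compute min(#a, #b) for a word over {a, b} using two cost registers."""
        x, y = 0, 0                      # registers: x counts a's, y counts b's
        for c in word:
            if c == "a":
                x = x + 1                # register update of the form x := x + c
            elif c == "b":
                y = y + 1
        return min(x, y)                 # output function combines registers with min

    print(min_count("abbab"))            # -> 2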

Publication Stats

22k Citations
61.52 Total Impact Points

Institutions

  • 1970–2015
    • University of Pennsylvania
      • Department of Computer and Information Science
      • Department of Electrical and Systems Engineering
      Philadelphia, Pennsylvania, United States
  • 2013
    • Indian Institute of Technology Bombay
      Mumbai, Maharashtra, India
  • 2006–2013
    • William Penn University
      Philadelphia, Pennsylvania, United States
  • 1996–2006
    • University of California, Berkeley
      • Department of Electrical Engineering and Computer Sciences
      Berkeley, CA, United States
  • 1990–2006
    • Stanford University
      • Department of Computer Science
      Palo Alto, California, United States
  • 2000
    • The University of Edinburgh
      • Laboratory for Foundations of Computer Science (LFCS)
      Edinburgh, Scotland, United Kingdom
  • 1992–1997
    • AT&T Labs
      Austin, Texas, United States