Publications (281) · 60.49 Total impact

Article: Synthesis through Unification
ABSTRACT: Given a specification and a set of candidate programs (program space), the program synthesis problem is to find a candidate program that satisfies the specification. We present the synthesis through unification (STUN) approach, which is an extension of the counterexample-guided inductive synthesis (CEGIS) approach. In CEGIS, the synthesizer maintains a subset S of inputs and a candidate program Prog that is correct for S. The synthesizer repeatedly checks if there exists a counterexample input c such that the execution of Prog is incorrect on c. If so, the synthesizer enlarges S to include c, and picks a program from the program space that is correct for the new set S. The STUN approach extends CEGIS with the idea that given a program Prog that is correct for a subset of inputs, the synthesizer can try to find a program Prog' that is correct for the rest of the inputs. If Prog and Prog' can be unified into a program in the program space, then a solution has been found. We present a generic synthesis procedure based on the STUN approach and specialize it for three different domains by providing the appropriate unification operators. We implemented these specializations in prototype tools, and we show that our tools often perform significantly better on standard benchmarks than a tool based on a pure CEGIS approach.
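The CEGIS loop summarized in this abstract can be illustrated with a minimal Python sketch. It is not the STUN procedure or the authors' tool: the program space (pairs (a, b) read as x -> a*x + b), the finite input domain, and the names `run`, `cegis`, `SPACE`, and `INPUTS` are all assumptions made for illustration, and verification is done by exhaustive search rather than by a solver.

```python
from itertools import product

def run(program, x):
    """Evaluate a toy program (a, b), read as the function x -> a*x + b."""
    a, b = program
    return a * x + b

def cegis(spec, program_space, inputs):
    """Toy CEGIS loop; spec(x, y) says whether output y is acceptable on input x."""
    samples = []  # the subset S of inputs collected so far
    while True:
        # Inductive step: pick any candidate that is correct on every sample in S.
        candidate = next(
            (p for p in program_space
             if all(spec(x, run(p, x)) for x in samples)),
            None,
        )
        if candidate is None:
            return None  # no program in the space is consistent with S
        # Verification step: look for a counterexample input c.
        counterexample = next(
            (x for x in inputs if not spec(x, run(candidate, x))), None
        )
        if counterexample is None:
            return candidate  # correct on the whole (finite) input domain
        samples.append(counterexample)  # enlarge S to include c

# Hypothetical program space and input domain, chosen only for the example.
SPACE = list(product(range(-3, 4), repeat=2))
INPUTS = range(-5, 6)
result = cegis(lambda x, y: y == 2 * x + 1, SPACE, INPUTS)
```

Each counterexample rules out the candidate that produced it, so on a finite program space the loop always terminates, either with a correct program or with `None`.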
ABSTRACT: A distributed protocol is typically modeled as a set of communicating processes, where each process is described as an extended state machine along with fairness assumptions, and its correctness is specified using safety and liveness requirements. Designing correct distributed protocols is a challenging task. Aimed at simplifying this task, we allow the designer to leave some of the guards and updates to state variables in the description of extended state machines as unknown functions. The protocol completion problem then is to find interpretations for these unknown functions while guaranteeing correctness. In many distributed protocols, process behaviors are naturally symmetric, and thus, synthesized expressions are further required to obey symmetry constraints. Our counterexample-guided synthesis algorithm consists of repeatedly invoking two phases. In the first phase, candidates for unknown expressions are generated using the SMT solver Z3. This phase requires carefully orchestrating constraints to enforce the desired symmetry in read/write accesses. In the second phase, the resulting completed protocol is checked for correctness using a custom-built model checker that handles fairness assumptions, safety and liveness requirements, and exploits symmetry. When model checking fails, our tool examines a set of counterexamples to safety/liveness properties to generate constraints on unknown functions that must be satisfied by subsequent completions. For evaluation, we show that our prototype is able to automatically discover interesting missing details in distributed protocols for mutual exclusion, self-stabilization, and cache coherence.
ABSTRACT: In computer-aided education, the goal of automatic feedback is to provide a meaningful explanation of students' mistakes. We focus on providing feedback for constructing a deterministic finite automaton that accepts strings that match a described pattern. Natural choices for feedback are binary feedback (correct/wrong) and a counterexample of a string that is processed incorrectly. Such feedback is easy to compute but might not provide the student enough help. Our first contribution is a novel way to automatically compute alternative conceptual hints. Our second contribution is a rigorous evaluation of feedback with 377 students. We find that providing either counterexamples or hints is judged as helpful, increases student perseverance, and can improve problem completion time. However, both strategies have particular strengths and weaknesses. Since our feedback is completely automatic, it can be deployed at scale and integrated into existing massive open online courses.
ACM Transactions on Computer-Human Interaction 03/2015; 22(2). DOI:10.1145/2723163 · 0.57 Impact Factor
Article: DReX
ACM SIGPLAN Notices 01/2015; 50(1):125-137. DOI:10.1145/2775051.2676981 · 0.62 Impact Factor
Conference Paper: DReX: A Declarative Language for Efficiently Evaluating Regular String Transformations
ABSTRACT: We present DReX, a declarative language that can express all regular string-to-string transformations, and can still be efficiently evaluated. The class of regular string transformations has a robust theoretical foundation including multiple characterizations, closure properties, and decidable analysis questions, and admits a number of string operations such as insertion, deletion, substring swap, and reversal. Recent research has led to a characterization of regular string transformations using a primitive set of function combinators analogous to the definition of regular languages using regular expressions. While these combinators form the basis for the language DReX proposed in this paper, our main technical focus is on the complexity of evaluating the output of a DReX program on a given input string. It turns out that the natural evaluation algorithm involves dynamic programming, leading to complexity that is cubic in the length of the input string. Our main contribution is identifying a consistency restriction on the use of combinators in DReX programs, and a single-pass evaluation algorithm for consistent programs with time complexity that is linear in the length of the input string and polynomial in the size of the program. We show that the consistency restriction does not limit the expressiveness, and whether a DReX program is consistent can be checked efficiently. We report on a prototype implementation, and evaluate it using a representative set of text processing tasks.
POPL '15; 01/2015
Conference Paper: Symbolic Visibly Pushdown Automata
ABSTRACT: Nested words model data with both linear and hierarchical structure such as XML documents and program traces. A nested word is a sequence of positions together with a matching relation that connects open tags (calls) with the corresponding close tags (returns). Visibly Pushdown Automata are a restricted class of pushdown automata that process nested words, and have many appealing theoretical properties such as closure under Boolean operations and decidable equivalence. However, like other classical automata models, they are limited to finite alphabets. This limitation is restrictive for practical applications to both XML processing and program trace analysis, where values for individual symbols are usually drawn from an unbounded domain. With this motivation, we introduce Symbolic Visibly Pushdown Automata (SVPA) as an executable model for nested words over infinite alphabets. In this model, transitions are labeled with first-order predicates over the input alphabet, analogous to symbolic automata processing strings over infinite alphabets. A key novelty of SVPAs is the use of binary predicates to model relations between open and close tags in a nested word. We show how SVPAs still enjoy the decidability and closure properties of Visibly Pushdown Automata. We use SVPAs to model XML validation policies and program properties that are not naturally expressible with previous formalisms and provide experimental results for our implementation.
CAV '14; 07/2014
ABSTRACT: Scenarios, or Message Sequence Charts, offer an intuitive way of describing the desired behaviors of a distributed protocol. In this paper we propose a new way of specifying finite-state protocols using scenarios: we show that it is possible to automatically derive a distributed implementation from a set of scenarios augmented with a set of safety and liveness requirements, provided the given scenarios adequately cover all the states of the desired implementation. We first derive incomplete state machines from the given scenarios, and then synthesis corresponds to completing the transition relation of individual processes so that the global product meets the specified requirements. This completion problem, in general, has the same complexity, PSPACE, as the verification problem, but unlike the verification problem, is NP-complete for a constant number of processes. We present two algorithms for solving the completion problem, one based on a heuristic search in the space of possible completions and one based on OBDD-based symbolic fixpoint computation. We evaluate the proposed methodology for protocol specification and the effectiveness of the synthesis algorithms using the classical alternating-bit protocol.
ABSTRACT: We focus on (partial) functions that map input strings to a monoid such as the set of integers with addition and the set of output strings with concatenation. The notion of regularity for such functions has been defined using two-way finite-state transducers, (one-way) cost register automata, and MSO-definable graph transformations. In this paper, we give an algebraic and machine-independent characterization of this class analogous to the definition of regular languages by regular expressions. When the monoid is commutative, we prove that every regular function can be constructed from constant functions using the combinators of choice, split sum, and iterated sum, that are analogs of union, concatenation, and Kleene*, respectively, but enforce unique (or unambiguous) parsing. Our main result is for the general case of non-commutative monoids, which is of particular interest for capturing regular string-to-string transformations for document processing. We prove that the following additional combinators suffice for constructing all regular functions: (1) the left-additive versions of split sum and iterated sum, which allow transformations such as string reversal; (2) sum of functions, which allows transformations such as copying of strings; and (3) function composition, or alternatively, a new concept of chained sum, which allows output values from adjacent blocks to mix.
ABSTRACT: The design and implementation of software for medical devices is challenging due to the closed-loop interaction with the patient, which is a stochastic physical environment. The safety-critical nature and the lack of existing industry standards for verification make this an ideal domain for exploring applications of formal modeling and closed-loop analysis. The biggest challenge is that the environment model(s) have to be both complex enough to express the physiological requirements and general enough to cover all possible inputs to the device. In this effort, we use a dual chamber implantable pacemaker as a case study to demonstrate verification of software specifications of medical devices as timed-automata models in UPPAAL. The pacemaker model is based on the specifications and algorithm descriptions from Boston Scientific. The heart is modeled using timed automata based on the physiology of the heart. The model is gradually abstracted with timed simulation to preserve properties. A manual Counter-Example-Guided Abstraction and Refinement (CEGAR) framework has been adapted to refine the heart model when spurious counterexamples are found. To demonstrate the closed-loop nature of the problem and heart model refinement, we investigated two clinical cases of Pacemaker Mediated Tachycardia and verified their corresponding correction algorithms in the pacemaker. Along with our tools for code generation from UPPAAL models, this effort enables model-driven design and certification of software for medical devices.
International Journal on Software Tools for Technology Transfer 01/2014; 16(2). DOI:10.1007/s10009-013-0289-7
ABSTRACT: The reactive synthesis problem is to find a finite-state controller that satisfies a given temporal-logic specification regardless of how its environment behaves. Developing a formal specification is a challenging and tedious task and initial specifications are often unrealizable. In many cases, the source of unrealizability is the lack of adequate assumptions on the environment of the system. In this paper, we consider the problem of automatically correcting an unrealizable specification given in the generalized reactivity (1) fragment of linear temporal logic by adding assumptions on the environment. When a temporal-logic specification is unrealizable, the synthesis algorithm computes a counterstrategy as a witness. Our algorithm then analyzes this counterstrategy and synthesizes a set of candidate environment assumptions that can be used to remove the counterstrategy from the environment's possible behaviors. We demonstrate the applicability of our approach with several case studies.
Conference Paper: Automated grading of DFA constructions
ABSTRACT: One challenge in making online education more effective is to develop automatic grading software that can provide meaningful feedback. This paper provides a solution to automatic grading of the standard computation-theory problem that asks a student to construct a deterministic finite automaton (DFA) from the given description of its language. We focus on how to assign partial grades for incorrect answers. Each student's answer is compared to the correct DFA using a hybrid of three techniques devised to capture different classes of errors. First, in an attempt to catch syntactic mistakes, we compute the edit distance between the two DFA descriptions. Second, we consider the entropy of the symmetric difference of the languages of the two DFAs, and compute a score that estimates the fraction of the number of strings on which the student answer is wrong. Our third technique is aimed at capturing mistakes in reading of the problem description. For this purpose, we consider a description language MOSEL, which adds syntactic sugar to the classical Monadic Second Order Logic, and allows defining regular languages in a concise and natural way. We provide algorithms, along with optimizations, for transforming MOSEL descriptions into DFAs and vice versa. These allow us to compute the syntactic edit distance of the incorrect answer from the correct one in terms of their logical representations. We report an experimental study that evaluates hundreds of answers submitted by (real) students by comparing grades/feedback computed by our tool with human graders. Our conclusion is that the tool is able to assign partial grades in a meaningful way, and should be preferred over the human graders for both scalability and consistency.
Proceedings of the Twenty-Third International Joint Conference on Artificial Intelligence; 08/2013
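The second technique in this abstract, scoring an answer by the fraction of strings on which it is wrong, can be approximated with a short Python sketch. This is a simplification under stated assumptions: it enumerates all strings up to a fixed length instead of computing the entropy-based measure the abstract mentions, and the DFA encoding and the names `accepts`, `disagreement_fraction`, `EVEN_A`, and `ACCEPT_ALL` are hypothetical.

```python
from itertools import product

def accepts(dfa, word):
    """Run a complete DFA, given as a dict, on a sequence of symbols."""
    state = dfa["start"]
    for symbol in word:
        state = dfa["delta"][(state, symbol)]
    return state in dfa["accepting"]

def disagreement_fraction(reference, attempt, alphabet, max_len):
    """Fraction of strings of length <= max_len in the symmetric difference."""
    total = wrong = 0
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            total += 1
            if accepts(reference, letters) != accepts(attempt, letters):
                wrong += 1
    return wrong / total

# Reference language: strings over {a, b} with an even number of a's.
EVEN_A = {
    "start": "even",
    "accepting": {"even"},
    "delta": {("even", "a"): "odd", ("even", "b"): "even",
              ("odd", "a"): "even", ("odd", "b"): "odd"},
}
# A hypothetical incorrect student answer that accepts every string.
ACCEPT_ALL = {
    "start": "q",
    "accepting": {"q"},
    "delta": {("q", "a"): "q", ("q", "b"): "q"},
}
score = disagreement_fraction(EVEN_A, ACCEPT_ALL, "ab", 2)
```

With `max_len` set to 2 there are seven strings, and the two automata disagree on the three that contain an odd number of a's.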
Conference Paper: Regular Functions and Cost Register Automata
ABSTRACT: We propose a deterministic model for associating costs with strings that is parameterized by operations of interest (such as addition, scaling, and minimum), a notion of regularity that provides a yardstick to measure expressiveness, and study decision problems and theoretical properties of resulting classes of cost functions. Our definition of regularity relies on the theory of string-to-tree transducers, and allows associating costs with events that are conditioned on regular properties of future events. Our model of cost register automata allows computation of regular functions using multiple "write-only" registers whose values can be combined using the allowed set of operations. We show that the classical shortest-path algorithms as well as the algorithms designed for computing discounted costs can be adapted for solving the min-cost problems for the more general classes of functions specified in our model. Cost register automata with the operations of minimum and increment give a deterministic model that is equivalent to weighted automata, an extensively studied nondeterministic model, and this connection results in new insights and new open problems.
Proceedings of the 2013 28th Annual ACM/IEEE Symposium on Logic in Computer Science; 06/2013
Conference Paper: TRANSIT: specifying protocols with concolic snippets
ABSTRACT: With the maturing of technology for model checking and constraint solving, there is an emerging opportunity to develop programming tools that can transform the way systems are specified. In this paper, we propose a new way to program distributed protocols using concolic snippets. Concolic snippets are sample execution fragments that contain both concrete and symbolic values. The proposed approach allows the programmer to describe the desired system partially using the traditional model of communicating extended finite-state machines (EFSM), along with high-level invariants and concrete execution fragments. Our synthesis engine completes an EFSM skeleton by inferring guards and updates from the given fragments, which is then automatically analyzed using a model checker with respect to the desired invariants. The counterexamples produced by the model checker can then be used by the programmer to add new concrete execution fragments that describe the correct behavior in the specific scenario corresponding to the counterexample. We describe TRANSIT, a language and prototype implementation of the proposed specification methodology for distributed protocols.
Experimental evaluations of TRANSIT to specify cache coherence protocols show that (1) the algorithm for expression inference from concolic snippets can synthesize expressions of size 15 involving typical operators over commonly occurring types, (2) for a classical directory-based protocol, TRANSIT automatically generates, in a few seconds, a complete implementation from a specification consisting of the EFSM structure and a few concrete examples for every transition, and (3) a published partial description of the SGI Origin cache coherence protocol maps directly to symbolic examples and leads to a complete implementation in a few iterations, with the programmer correcting counterexamples resulting from underspecified transitions by adding concrete examples in each iteration.
Proceedings of the 34th ACM SIGPLAN conference on Programming language design and implementation; 06/2013
ABSTRACT: Additive Cost Register Automata (ACRA) map strings to integers using a finite set of registers that are updated using assignments of the form "x := y + c" at every step. The corresponding class of additive regular functions has multiple equivalent characterizations, appealing closure properties, and a decidable equivalence problem. In this paper, we solve two decision problems for this model. First, we define the register complexity of an additive regular function to be the minimum number of registers that an ACRA needs to compute it. We characterize the register complexity by a necessary and sufficient condition regarding the largest subset of registers whose values can be made far apart from one another. We then use this condition to design a PSPACE algorithm to compute the register complexity of a given ACRA, and establish a matching lower bound. Our results also lead to a machine-independent characterization of the register complexity of additive regular functions. Second, we consider two-player games over ACRAs, where the objective of one of the players is to reach a target set while minimizing the cost. We show the corresponding decision problem to be EXPTIME-complete when costs are nonnegative integers, but undecidable when costs are integers.
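To make the register updates of the form "x := y + c" concrete, here is a minimal Python sketch of running such a machine on a string. The encoding is an assumption made for illustration (a transition table keyed by state and symbol, a per-state output of the form register plus constant, all registers starting at 0), and the names `run_acra`, `TRANSITIONS`, and `OUTPUTS` are hypothetical rather than taken from the paper.

```python
def run_acra(word, transitions, outputs, start, registers):
    """Simulate a toy additive cost register automaton on a string."""
    state = start
    values = {r: 0 for r in registers}  # assumption: registers start at 0
    for symbol in word:
        state, update = transitions[(state, symbol)]
        # All assignments x := y + c read the old register values (parallel update).
        values = {x: values[y] + c for x, (y, c) in update.items()}
    register, constant = outputs[state]
    return values[register] + constant

# Example function: the number of a's if the word ends in 'a' (or is empty),
# and the number of b's if it ends in 'b'. It uses two registers.
REGISTERS = ("na", "nb")
COUNT_A = {"na": ("na", 1), "nb": ("nb", 0)}
COUNT_B = {"na": ("na", 0), "nb": ("nb", 1)}
STATES = ("q0", "qa", "qb")
TRANSITIONS = {(q, "a"): ("qa", COUNT_A) for q in STATES}
TRANSITIONS.update({(q, "b"): ("qb", COUNT_B) for q in STATES})
OUTPUTS = {"q0": ("na", 0), "qa": ("na", 0), "qb": ("nb", 0)}

cost = run_acra("aabab", TRANSITIONS, OUTPUTS, "q0", REGISTERS)
```

This example keeps two counters alive because which one is output depends on the last symbol, which is the kind of situation the register complexity question in the abstract is about.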
Conference Paper: Towards synthesis of platform-aware attack-resilient control systems: extended abstract
ABSTRACT: We consider a resource allocation problem that ensures a fair QoS (Quality of Service) level among selfish clients in a cloud computing system. The clients share multiple resources and process applications concurrently on the cloud computing ...
Proceedings of the 2nd ACM international conference on High confidence networked systems; 04/2013
ABSTRACT: Bounded-rate multi-mode systems (BMMS) are hybrid systems that can switch freely among a finite set of modes, and whose dynamics is specified by a finite number of real-valued variables with mode-dependent rates that can vary within given bounded sets. The schedulability problem for BMMS is defined as an infinite-round game between two players, the scheduler and the environment, where in each round the scheduler proposes a time and a mode while the environment chooses an allowable rate for that mode, and the state of the system changes linearly in the direction of the rate vector. The goal of the scheduler is to keep the state of the system within a pre-specified safe set using a non-Zeno schedule, while the goal of the environment is the opposite. Green scheduling under uncertainty is a paradigmatic example of BMMS where a winning strategy of the scheduler corresponds to a robust energy-optimal policy. We present an algorithm to decide whether the scheduler has a winning strategy from an arbitrary starting state, and give an algorithm to compute such a winning strategy, if it exists. We show that the schedulability problem for BMMS is co-NP-complete in general, but for two variables it is in PTIME. We also study the discrete schedulability problem where the environment has only finitely many choices of rate vectors in each mode and the scheduler can make decisions only at multiples of a given clock period, and show it to be EXPTIME-complete.
ABSTRACT: Courcelle (1992) proposed the idea of using logic, in particular Monadic second-order logic (MSO), to define graph-to-graph transformations. Transducers, on the other hand, are executable machine models to define transformations, and are typically studied in the context of string-to-string transformations. Engelfriet and Hoogeboom (2001) studied two-way finite-state string-to-string transducers and showed that their expressiveness matches MSO-definable transformations (MSOT). Alur and Cerny (2011) presented streaming transducers, one-way transducers equipped with multiple registers that can store output strings, as an equi-expressive model. Natural generalizations of streaming transducers to string-to-tree (Alur and D'Antoni, 2012) and infinite-string-to-string (Alur, Filiot, and Trivedi, 2012) cases preserve MSO-expressiveness. While earlier reductions from MSOT to streaming transducers used two-way transducers as the intermediate model, we revisit the earlier reductions in a more general, and previously unexplored, setting of infinite-string-to-tree transformations, and provide a direct reduction. Proof techniques used for this new reduction exploit the conceptual tools (composition theorem and finite additive coloring theorem) presented by Shelah (1975) in his alternative proof of Büchi's theorem. Using such streaming string-to-tree transducers we show the decidability of functional equivalence for MSO-definable infinite-string-to-tree transducers.
Logic in Computer Science (LICS), 2013 28th Annual IEEE/ACM Symposium on; 01/2013
Conference Paper: Syntax-guided synthesis
ABSTRACT: The classical formulation of the program-synthesis problem is to find a program that meets a correctness specification given as a logical formula. Recent work on program synthesis and program optimization illustrates many potential benefits of allowing the user to supplement the logical specification with a syntactic template that constrains the space of allowed implementations. Our goal is to identify the core computational problem common to these proposals in a logical framework. The input to the syntax-guided synthesis problem (SyGuS) consists of a background theory, a semantic correctness specification for the desired program given by a logical formula, and a syntactic set of candidate implementations given by a grammar. The computational problem then is to find an implementation from the set of candidate expressions so that it satisfies the specification in the given theory. We describe three different instantiations of the counterexample-guided inductive synthesis (CEGIS) strategy for solving the synthesis problem, report on prototype implementations, and present experimental results on an initial set of benchmarks.
Formal Methods in Computer-Aided Design (FMCAD), 2013; 01/2013
Conference Paper: On the feasibility of automation for bandwidth allocation problems in data centers
ABSTRACT: Mapping virtual networks to physical networks under bandwidth constraints is a key computational problem for the management of data centers. Recently proposed heuristic strategies for this problem work efficiently, but are not guaranteed to always find an allocation even when one exists. Given that the bandwidth allocation problem is NP-complete, and the state-of-the-art SAT solvers have recently been successfully applied to NP-hard problems in planning and formal verification, the goal of this paper is to study whether these SAT solvers can be used to solve the bandwidth allocation problem exactly with acceptable overhead. We investigate alternative ways of encoding the allocation problem, and develop techniques for abstraction and refinement of network graphs for scalability. We report experimental comparisons of the proposed encodings with the existing heuristics for typical datacenter topologies.
Formal Methods in Computer-Aided Design (FMCAD), 2013; 01/2013
Article: 2010 CAV award announcement
ABSTRACT: The 2010 CAV (Computer-Aided Verification) award was awarded to Kenneth L. McMillan of Cadence Research Laboratories for a series of fundamental contributions resulting in significant advances in scalability of model checking tools. The annual award recognizes a specific fundamental contribution or a series of outstanding contributions to the CAV field.
Formal Methods in System Design 08/2012; 40(2). DOI:10.1007/s10703-011-0125-1 · 0.40 Impact Factor
Publication Stats
22k  Citations  
60.49  Total Impact Points  
Institutions

1970–2015

University of Pennsylvania
 • Department of Computer and Information Science
 • Department of Electrical and Systems Engineering
Philadelphia, Pennsylvania, United States


2013

Indian Institute of Technology Bombay
Mumbai, Maharashtra, India


2006–2013

William Penn University
Philadelphia, Pennsylvania, United States


1996–2006

University of California, Berkeley
 Department of Electrical Engineering and Computer Sciences
Berkeley, CA, United States


1990–2006

Stanford University
 Department of Computer Science
Palo Alto, California, United States


2000

The University of Edinburgh
 Laboratory for Foundations of Computer Science (LFCS)
Edinburgh, Scotland, United Kingdom


1992–1997

AT&T Labs
Austin, Texas, United States
