Nadia Polikarpova's research while affiliated with University of California, San Diego and other places
What is this page?
This page lists the scientific contributions of an author, who either does not have a ResearchGate profile, or has not yet added these contributions to their profile.
It was automatically created by ResearchGate to create a record of this author's body of work. We create such pages to advance our goal of creating and maintaining the most comprehensive scientific repository possible. In doing so, we process publicly available (personal) data relating to the author as a member of the scientific community.
Publications (54)
AI-powered programming assistants are increasingly gaining popularity, with GitHub Copilot alone used by over a million developers worldwide. These tools are far from perfect, however, producing code suggestions that may be incorrect or incomplete in subtle ways. As a result, developers face a new set of challenges when they need to understand, val...
The Rust type system guarantees memory safety and data-race freedom. However, to satisfy Rust's type rules, many familiar implementation patterns must be adapted substantially. These necessary adaptations complicate programming and might hinder language adoption. In this paper, we demonstrate that, in contrast to manual programming, automatic synth...
Powered by recent advances in code-generating models, AI assistants like GitHub Copilot promise to change the face of programming forever. But what is this new face of programming? We present the first grounded theory analysis of how programmers interact with Copilot, based on observing 20 participants, with a range of prior experience using the ass...
Library learning compresses a given corpus of programs by extracting common structure from the corpus into reusable library functions. Prior work on library learning suffers from two limitations that prevent it from scaling to larger, more complex inputs. First, it explores too many candidate library functions that are not useful for compression. S...
Many problem domains, including program synthesis and rewrite-based optimization, require searching astronomically large spaces of programs. Existing approaches often rely on building specialized data structures—version-space algebras, finite tree automata, or e-graphs—to compactly represent such spaces. At their core, all these data structures exp...
Powered by recent advances in code-generating models, AI assistants like GitHub Copilot promise to change the face of programming forever. But what is this new face of programming? We present the first grounded theory analysis of how programmers interact with Copilot, based on observing 20 participants, with a range of prior experience using the as...
Many problem domains, including program synthesis and rewrite-based optimization, require searching astronomically large spaces of programs. Existing approaches often rely on building specialized data structures (version-space algebras, finite tree automata, or e-graphs) to compactly represent these programs. To find a compact representation, e...
With the rise of software-as-a-service and microservice architectures, RESTful APIs are now ubiquitous in mobile and web applications. A service can have tens or hundreds of API methods, making it a challenge for programmers to find the right combination of methods to solve their task. We present APIphany, a component-based synthesizer for programs...
One vision for program synthesis, and specifically for programming by example (PBE), is an interactive programmer's assistant, integrated into the development environment. To make program synthesis practical for interactive use, prior work on Small-Step Live PBE has proposed to limit the scope of synthesis to small code snippets, and enable the use...
Automated deductive program synthesis promises to generate executable programs from concise specifications, along with proofs of correctness that can be independently verified using third-party tools. However, an attempt to exercise this promise using existing proof-certification frameworks reveals significant discrepancies in how proof derivations...
This paper presents the main ideas behind deductive synthesis of heap-manipulating programs and outlines present challenges faced by this approach as well as future opportunities for its applications.
Optimizing machine learning (ML) workloads on structured data is a key concern for data platforms. One class of optimizations called "factorized ML" helps reduce ML runtimes over multi-table datasets by pushing ML computations down through joins, avoiding the need to materialize such joins. The recent Morpheus system automated factorized ML to any...
A key challenge in program synthesis is the astronomical size of the search space the synthesizer has to explore. In response to this challenge, recent work proposed to guide synthesis using learned probabilistic models. Obtaining such a model, however, might be infeasible for a problem domain where no high-quality training data is available. In th...
We present Hoogle+, a web-based API discovery tool for Haskell. A Hoogle+ user can specify a programming task using either a type, a set of input-output tests, or both. Given a specification, the tool returns a list of matching programs composed from functions in popular Haskell libraries, and annotated with automatically-generated examples of thei...
This article presents liquid resource types, a technique for automatically verifying the resource consumption of functional programs. Existing resource analysis techniques trade automation for flexibility – automated techniques are restricted to relatively constrained families of resource bounds, while more expressive proof techniques admitting val...
We present Lifty, a domain-specific language for data-centric applications that manipulate sensitive data. A Lifty programmer annotates the sources of sensitive data with declarative security policies, and the language statically and automatically verifies that the application handles the data according to the policies. Moreover, if verification fa...
This article presents liquid resource types, a technique for automatically verifying the resource consumption of functional programs. Existing resource analysis techniques trade automation for flexibility: automated techniques are restricted to relatively constrained families of resource bounds, while more expressive proof techniques admitting va...
In program synthesis there is a well-known trade-off between concise and strong specifications: if a specification is too verbose, it might be harder to write than the program; if it is too weak, the synthesised program might not match the user’s intent. In this work we explore the use of annotations for restricting memory access permissions in pro...
In program synthesis there is a well-known trade-off between concise and strong specifications: if a specification is too verbose, it might be harder to write than the program; if it is too weak, the synthesised program might not match the user's intent. In this work we explore the use of annotations for restricting memory access permissions in pro...
We consider the problem of type-directed component-based synthesis where, given a set of (typed) components and a query type, the goal is to synthesize a term that inhabits the query. Classical approaches based on proof search in intuitionistic logics do not scale up to the standard libraries of modern languages, which span hundreds or thousands of...
We consider the problem of type-directed component-based synthesis where, given a set of (typed) components and a query type, the goal is to synthesize a term that inhabits the query. Classical approaches based on proof search in intuitionistic logics do not scale up to the standard libraries of modern languages, which span hundreds or thousands of...
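The problem statement in the abstract above can be illustrated with a deliberately naive sketch. This is not the paper's algorithm (the abstract is truncated before it is described); it is plain bounded enumeration over a tiny, hypothetical component library with monomorphic types. The names `COMPONENTS`, `inhabitants`, and `synthesize` are assumptions made for illustration only.

```python
from itertools import product

# Hypothetical component library: name -> (argument types, result type).
COMPONENTS = {
    "length": (("String",), "Int"),
    "show": (("Int",), "String"),
    "append": (("String", "String"), "String"),
}


def inhabitants(goal, env, depth):
    """Yield terms of type `goal` built from variables in `env` and from
    COMPONENTS, using at most `depth` nested component applications."""
    # A variable of the right type inhabits the goal directly.
    for var, ty in env.items():
        if ty == goal:
            yield var
    if depth == 0:
        return
    # Otherwise, apply any component whose result type matches the goal,
    # recursively synthesizing each of its arguments.
    for name, (arg_types, result) in COMPONENTS.items():
        if result != goal:
            continue
        arg_choices = [list(inhabitants(t, env, depth - 1)) for t in arg_types]
        for args in product(*arg_choices):
            yield f"{name}({', '.join(args)})"


def synthesize(query_args, query_result, max_depth=3):
    """Return the first term found at the smallest sufficient depth, or None.

    `query_args` maps argument names to types; together with `query_result`
    it plays the role of the query type."""
    for depth in range(max_depth + 1):
        for term in inhabitants(query_result, query_args, depth):
            return term
    return None


print(synthesize({"n": "Int"}, "String"))
print(synthesize({"s": "String"}, "Int"))
```

Even in this toy, the number of candidate terms grows quickly with depth and library size, which is the scalability concern the abstract raises for real standard libraries.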
Programmers frequently maintain implicit data invariants, which are relations between different data structures in a program. Traditionally, such invariants are manually enforced and checked by programmers. This ad-hoc practice is difficult because the programmer must manually account for all the locations and configurations that break an invariant...
This article presents resource-guided synthesis, a technique for synthesizing recursive programs that satisfy both a functional specification and a symbolic resource bound. The technique is type-directed and rests upon a novel type system that combines polymorphic refinement types with potential annotations of automatic amortized resource analysis....
This paper describes a deductive approach to synthesizing imperative programs with pointers from declarative specifications expressed in Separation Logic. Our synthesis algorithm takes as input a pair of assertions—a pre- and a postcondition—which describe two states of the symbolic heap, and derives a program that transforms one state into the oth...
This paper describes a deductive approach to synthesizing imperative programs with pointers from declarative specifications expressed in Separation Logic. Our synthesis algorithm takes as input a pair of assertions (a pre- and a postcondition) which describe two states of the symbolic heap, and derives a program that transforms one state into the...
Auto-active verifiers provide a level of automation intermediate between fully automatic and interactive: users supply code with annotations as input while benefiting from a high level of automation in the back-end. This paper presents AutoProof, a state-of-the-art auto-active verifier for object-oriented sequential programs with complex functional...
The comprehensive functionality and nontrivial design of realistic general-purpose container libraries pose challenges to formal verification that go beyond those of individual benchmark problems mainly targeted by the state of the art. We present our experience verifying the full functional correctness of EiffelBase2: a container library offering...
Recent work has proposed a promising approach to improving scalability of program synthesis by allowing the user to supply a syntactic template that constrains the space of potential programs. Unfortunately, creating templates often requires nontrivial effort from the user, which impedes the usability of the synthesizer. We present a solution to th...
We present Lifty, a language that uses type-driven program repair to enforce information flow policies. In Lifty, the programmer specifies a policy by annotating the source of sensitive data with a refinement type, and the system automatically inserts access checks necessary to enforce this policy across the code. This is a significant improvement...
We present a method for synthesizing recursive functions that provably satisfy a given specification in the form of a polymorphic refinement type. We observe that such specifications are particularly suitable for program synthesis for two reasons. First, they offer a unique combination of expressive power and decidability, which enables automatic v...
We present an algorithm for synthesizing recursive functions that provably satisfy a given specification in the form of a refinement type. We show that refinement types can be decomposed more effectively than other kinds of specifications, which helps prune the space of candidate programs the synthesizer has to consider. Our algorithm can automatic...
The comprehensive functionality and nontrivial design of realistic general-purpose container libraries pose challenges to formal verification that go beyond those of individual benchmark problems mainly targeted by the state of the art. We present our experience verifying the full functional correctness of EiffelBase2: a container library offeri...
Auto-active verifiers provide a level of automation intermediate between fully automatic and interactive: users supply code with annotations as input while benefiting from a high level of automation in the back-end. This paper presents AutoProof, a state-of-the-art auto-active verifier for object-oriented sequential programs with complex functional...
Modular reasoning about class invariants is challenging in the presence of dependencies among collaborating objects that need to maintain global consistency. This paper presents semantic collaboration: a novel methodology to specify and reason about class invariants of sequential object-oriented programs, which models dependencies between collabora...
When program verification fails, it is often hard to understand what went wrong in the absence of concrete executions that expose parts of the implementation or specification responsible for the failure. Automatic generation of such tests would require “executing” the complex specifications typically used for verification (with unbounded quantifica...
Calculational proofs—proofs by stepwise formula manipulation—are praised for their rigor, readability, and elegance. It seems desirable to reuse this style, often employed on paper, in the context of mechanized reasoning, and in particular, program verification.
This work leverages the power of SMT solvers to machine-check calculational proofs at t...
Experience with lightweight formal methods suggests that programmers are willing to write specification if it brings tangible benefits to their usual development activities. This paper considers stronger specifications and studies whether they can be deployed as an incremental practice that brings additional benefits without being unacceptably expe...
We propose a technique for verifying high-level security properties of cryptographic protocol implementations based on stepwise refinement. Our refinement strategy supports reasoning about abstract protocol descriptions in the symbolic model of cryptography and gradually concretizing them towards executable code. We have implemented the technique w...
This paper reports on the experiences with the program verification competition held during the FoVeOOS conference in October 2011. There were 6 teams participating in this competition. We discuss the three different challenges that were posed and the solutions developed by the teams. We conclude with a discussion about the value of such competitio...
We, the organizers and participants, report on our experiences from the 1st Verified Software Competition, held in August 2010 in Edinburgh at the VSTTE 2010 conference.
Reusable software components need expressive specifications. This paper outlines a rigorous foundation to model-based contracts, a method to equip classes with strong contracts that support accurate design, implementation, and formal verification of reusable components. Model-based contracts conservatively extend the classic Design by Contract with...
Where do contracts (specification elements embedded in executable code) come from? To produce them, should we rely on the programmers, on automatic tools, or some combination? Recent work, in particular the Daikon system, has shown that it is possible to infer some contracts automatically from program executions. The main incentive has been an...
Citations
... Our methodology draws inspiration from a widely known heuristic in problem solving, which involves the decomposition of the problem into manageable sub-problems (Egidi, 2006). This approach is valuable in software development (Charitsis et al, 2022), and particularly in working with generative AI models (Barke et al, 2023). Recent studies aim to enable the decomposition ability of AI models by enhancing the prompt with a series of intermediate NL reasoning steps, namely chain of thought (Wei et al, 2022), tree of thought (Yao et al, 2023), and plan-and-solve prompting (Wang et al, 2023). ...
... Another promising method is learning a library of functions from previously solved problems. These functions are then reusable in an updated domain-specific language to solve more challenging problems (Hewitt, Le, and Tenenbaum 2020; Ellis et al. 2021; Ellis et al. 2018; Cao et al. 2023; Bowers et al. 2023). ...
... ECTAs [23], [24] are another, related compact data structure that extends e-graphs, Version-Space Algebras [25], [26], and Finite Tree Automata [27], with the concept of "entanglement"; that is, some choices of terms from e-classes may depend on choices done in other e-classes. Since the backbone of ECTAs is quite similar to an e-graph, the colors extension is applicable to this domain as well. ...
... JLIBSKETCH [14] uses algebraic properties to represent the semantics of modules and is a key component of our implementation. (CL)S [2] and APIphany [8] use types to represent the behavior of components and can be used in tandem with specialized type-directed synthesizers. The key differences between our work and these tools is that MOSS provides two well-defined synthesis primitives that support composing multiple modules, rather than synthesizing just one implementation for one module. ...
Reference: Modular System Synthesis
... For example, "lexecuting" code could help detect bugs that manifest through obvious signs of misbehavior, such as runtime exceptions or assertion violations. Likewise, our approach could be used to validate code generated by code synthesis techniques [29] or generative language models [17,84]. Other potential applications include checking if and how a code change modifies the observable behavior [67], and classical dynamic analyses, such as detecting security vulnerabilities via taint analysis. ...
Reference: LExecutor: Learning-Guided Execution
... Some interesting proposals emerged with synthesizing loop-free bitvector programs [42,55]. Recently, synthesizing heap manipulations [36,86,87,90,116] and data manipulations [112][113][114] have attracted a lot of attention. There have also been attempts at synthesizing randomized programs, like that for differential privacy [91,115]. ...
... That is, even if we pass a specification to SuSLik, SuSLik may not produce a program satisfying the specification. According to Itzhaky et al. [5], different synthesis tasks benefit from different search parameters, and we might need a mechanism to tune SuSLik's search strategy for a given synthesis task. ...
Reference: Genetic Algorithm for Program Synthesis
... There is a large body of work on the synthesis of functional recursive programs. Various approaches have been proposed to synthesize functional recursive programs from input-output examples [Feser et al. 2015; Lubin et al. 2020; Osera and Zdancewic 2015], refinement types [Polikarpova et al. 2016], logical specifications [Itzhaky et al. 2021; Kneuss et al. 2013], and a reference implementation with desired type invariants [Farzan and Nicolet 2021]. In the following, we will mainly focus on the prior work of inductive synthesis of functional recursive programs. ...
... Constraint solving is a technique used as an enabling technology in many areas of formal verification and analysis, such as symbolic execution (Cadar et al. (2006); Godefroid et al. (2005); King (1976); Sen et al. (2013)), static analysis (Wang et al. (2017); Gulwani et al. (2008)), or synthesis (Gulwani et al. (2011); Osera (2019); Knoth et al. (2019)). For instance, in symbolic execution, feasibility of a path in a program is tested by creating a constraint that encodes the evolution of the values of variables on the given path and checking if it is satisfiable. ...
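The path-feasibility check described in the excerpt above can be sketched as follows. This is a toy under stated assumptions: a real symbolic executor would hand the path constraint to an SMT solver, whereas this sketch substitutes brute-force search over a small bounded integer domain, so a `None` result only means "no solution within that domain". The function `path_feasible` and the example program are hypothetical.

```python
from itertools import product


def path_feasible(constraints, variables, domain=range(-10, 11)):
    """Return an assignment satisfying every path constraint, or None.

    Stand-in for a constraint solver: tries every combination of values
    from `domain`, so it is only meaningful for small bounded examples."""
    for values in product(domain, repeat=len(variables)):
        env = dict(zip(variables, values))
        if all(check(env) for check in constraints):
            return env
    return None


# Hypothetical program under analysis:
#     y = x + 1
#     if y > 5:
#         if x < 3:      # first path: can this branch be reached?
#             ...
# Each path is encoded as the conjunction of the assignment and the
# branch conditions taken along it.
infeasible_path = [
    lambda e: e["y"] == e["x"] + 1,
    lambda e: e["y"] > 5,
    lambda e: e["x"] < 3,
]
feasible_path = [
    lambda e: e["y"] == e["x"] + 1,
    lambda e: e["y"] > 5,
    lambda e: e["x"] < 8,
]

print(path_feasible(infeasible_path, ["x", "y"]))  # no assignment exists
print(path_feasible(feasible_path, ["x", "y"]))    # a witness input for the path
```

The first path is contradictory (y > 5 forces x > 4, which rules out x < 3), so the search finds nothing; the second yields a concrete witness.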
... That is because they enumerate all the candidate solutions in a certain order guaranteeing that the first solution found is minimal. On the other hand, other strategies such as stochastic search [38] or probabilistic model-based search [7,27] may find a solution which is not minimal. ...