Rajeev Alur’s research while affiliated with University of Pennsylvania and other places

What is this page?


This page lists works of an author who doesn't have a ResearchGate profile or hasn't added the works to their profile yet. It is automatically generated from public (personal) data to further our legitimate goal of comprehensive and accurate scientific recordkeeping. If you are this author and want this page removed, please let us know.

Publications (405)


Scenario-based Compositional Verification of Autonomous Systems with Neural Perception
  • Preprint

April 2025

·

2 Reads

Christopher Watson

·

Rajeev Alur

·

Divya Gopinath

·

[...]

·

Corina S. Pasareanu

Recent advances in deep learning have enabled the development of autonomous systems that use deep neural networks for perception. Formal verification of these systems is challenging due to the size and complexity of the perception DNNs as well as hard-to-quantify, changing environment conditions. To address these challenges, we propose a probabilistic verification framework for autonomous systems based on the following key concepts: (1) Scenario-based Modeling: We decompose the task (e.g., car navigation) into a composition of scenarios, each representing a different environment condition. (2) Probabilistic Abstractions: For each scenario, we build a compact abstraction of perception based on the DNN's performance on an offline dataset that represents the scenario's environment condition. (3) Symbolic Reasoning and Acceleration: The abstractions enable efficient compositional verification of the autonomous system via symbolic reasoning and a novel acceleration proof rule that bounds the error probability of the system under arbitrary variations of environment conditions. We illustrate our approach on two case studies: an experimental autonomous system that guides airplanes on taxiways using high-dimensional perception DNNs and a simulation model of an F1Tenth autonomous car using LiDAR observations.
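To make the "probabilistic abstraction" step concrete, here is a minimal Python sketch of the idea: a per-scenario confusion matrix estimated from offline data, plus a naive bound on run-level failure probability. The function names and the independence assumption are illustrative simplifications, not the paper's symbolic reasoning or acceleration proof rule.

```python
import numpy as np

def perception_abstraction(dnn, samples, num_states):
    """Estimate a confusion matrix P(predicted | true) for the perception
    DNN from an offline dataset drawn from one scenario's environment
    condition. `dnn` maps an observation to a predicted state index;
    `samples` is an iterable of (observation, true_state) pairs."""
    counts = np.zeros((num_states, num_states))
    for x, y_true in samples:
        counts[y_true, dnn(x)] += 1
    return counts / counts.sum(axis=1, keepdims=True)

def run_failure_bound(step_error_probs, horizons):
    """Naive bound on the probability that any perception error occurs
    along a run that spends horizons[i] steps in scenario i, assuming
    errors are independent across steps."""
    p_no_error = 1.0
    for p, T in zip(step_error_probs, horizons):
        p_no_error *= (1.0 - p) ** T
    return 1.0 - p_no_error
```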


Figure 1: Program decomposition for sum₄. ϕ₁ computes the sum of 2 digits, and ϕ₂ computes the sum of the results from the first sums.
Figure 4: Accuracy vs. time for add₁₅ for IndeCateR, A-NeSI, and CTSketch.
Hyperparameters used in CTSketch for the benchmark tasks
CTSketch: Compositional Tensor Sketching for Scalable Neurosymbolic Learning
  • Preprint
  • File available

March 2025

·

2 Reads

Many computational tasks benefit from being formulated as the composition of neural networks followed by a discrete symbolic program. The goal of neurosymbolic learning is to train the neural networks using only end-to-end input-output labels of the composite. We introduce CTSketch, a novel, scalable neurosymbolic learning algorithm. CTSketch uses two techniques to improve the scalability of neurosymbolic inference: decomposing the symbolic program into sub-programs and summarizing each sub-program with a sketched tensor. This strategy allows us to approximate the output distribution of the program with simple tensor operations over the input distributions and summaries. We provide theoretical insight into the maximum error of the approximation. Furthermore, we evaluate CTSketch on many benchmarks from the neurosymbolic literature, including some designed to evaluate scalability. Our results show that CTSketch pushes neurosymbolic learning to previously unattainable scales, obtaining high accuracy on tasks involving over one thousand inputs.
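The tensor-composition idea can be illustrated on the sum₄ task from Figure 1. The sketch below (assuming independent digit predictions, and omitting the low-rank tensor sketching that gives CTSketch its scalability) computes the program's output distribution with plain tensor operations:

```python
import numpy as np

def pair_sum_dist(p, q):
    """Distribution of x + y for independent inputs with distributions
    p and q: the outer product p ⊗ q contracted against the indicator
    tensor T[i, j, s] = 1 iff i + j == s."""
    out = np.zeros(len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out

# sum₄ decomposed as in Figure 1: ϕ₁ sums each pair of digits,
# ϕ₂ sums the two intermediate results.
p1 = p2 = p3 = p4 = np.full(10, 0.1)   # uniform digit predictions
s12 = pair_sum_dist(p1, p2)            # ϕ₁ on the first pair
s34 = pair_sum_dist(p3, p4)            # ϕ₁ on the second pair
final = pair_sum_dist(s12, s34)        # ϕ₂ combines the results
assert abs(final.sum() - 1.0) < 1e-9   # still a valid distribution
```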



Figure 1: Programs in VIEIRA using foundation models.
Figure 2: Snippet of the Python implementation of the foreign attribute (FA) clip, which uses the CLIP model for image classification. Note that the FA clip returns the foreign predicate (FP) run_clip.
Figure 9: Face Tagging (OFCP) exemplars.
Performance on the natural language reasoning datasets. Numbers are percentages (%).
Relational Programming with Foundation Models

December 2024

·

3 Reads

Foundation models have vast potential to enable diverse AI applications. The powerful yet incomplete nature of these models has spurred a wide range of mechanisms to augment them with capabilities such as in-context learning, information retrieval, and code interpreting. We propose Vieira, a declarative framework that unifies these mechanisms in a general solution for programming with foundation models. Vieira follows a probabilistic relational paradigm and treats foundation models as stateless functions with relational inputs and outputs. It supports neuro-symbolic applications by enabling the seamless combination of such models with logic programs, as well as complex, multi-modal applications by streamlining the composition of diverse sub-models. We implement Vieira by extending the Scallop compiler with a foreign interface that supports foundation models as plugins. We implement plugins for 12 foundation models including GPT, CLIP, and SAM. We evaluate Vieira on 9 challenging tasks that span language, vision, and structured and vector databases. Our evaluation shows that programs in Vieira are concise, can incorporate modern foundation models, and have comparable or better accuracy than competitive baselines.
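As a rough illustration of treating a foundation model as a stateless function with relational inputs and outputs, consider the hypothetical Python stub below. The real system integrates such functions into Scallop's logic programs; here fake_clip_scores stands in for an actual CLIP forward pass, and all names are illustrative.

```python
def fake_clip_scores(image_id: str, labels: list[str]) -> list[float]:
    # Stand-in for a real CLIP forward pass; returns a uniform softmax.
    return [1.0 / len(labels)] * len(labels)

def clip_classify(image_id: str, labels: list[str]):
    """A stateless, relational view of a vision model: given an image
    and a label vocabulary, emit probabilistic facts label(image, l)
    that a logic program can join against other relations."""
    scores = fake_clip_scores(image_id, labels)
    return {(p, (image_id, l)) for l, p in zip(labels, scores)}

facts = clip_classify("img_01", ["cat", "dog", "car"])
```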


Figure 3: The inference accuracy of different learned reasoners at t = 1, 2, 3 autoregressive steps (left, center, right), reported as the median over 5 random seeds. We report the rate at which all n coordinates of a predicted state match its label. The accuracy is high for embedding dimensions d ≥ 2n, which shows that our theory-based configuration of d = 2n can realistically attain good performance.
Figure 8: Two examples of rule suppression with GPT-2 on the Minecraft dataset: the suppressed tokens receive less attention when the adversarial suffix is present. With appropriate padding applied, we show the difference between the attention weights of the attacked (with suffix) and non-attacked (without suffix) generations. The attacked generation places less attention on the red positions and greater attention on the blue positions.
Figure 10: Example of rule suppression with Llama-2 on our custom dataset (Fig. 9). When attacked (left), the suppressed tokens receive less attention than in the non-attacked case (right). Rather than showing the difference of attention weights as in Fig. 8, this plot shows both the attacked and non-attacked attentions.
Logicbreaks: A Framework for Understanding Subversion of Rule-based Inference

June 2024

·

17 Reads

We study how to subvert language models from following the rules. We model rule-following as inference in propositional Horn logic, a mathematical system in which rules have the form "if P and Q, then R" for some propositions P, Q, and R. We prove that although transformers can faithfully abide by such rules, maliciously crafted prompts can nevertheless mislead even theoretically constructed models. Empirically, we find that attacks on our theoretical models mirror popular attacks on large language models. Our work suggests that studying smaller theoretical models can help understand the behavior of large language models in rule-based settings like logical reasoning and jailbreak attacks.
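The inference problem being modeled is ordinary forward chaining over propositional Horn rules. Below is a small, self-contained sketch of that reference semantics in plain Python (not the paper's transformer construction):

```python
def horn_closure(facts: set[str], rules: list[tuple[frozenset, str]]) -> set[str]:
    """Forward chaining for propositional Horn rules: a rule
    'if P and Q, then R' is represented as (frozenset({'P', 'Q'}), 'R')."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for body, head in rules:
            if body <= known and head not in known:
                known.add(head)
                changed = True
    return known

rules = [(frozenset({"P", "Q"}), "R"), (frozenset({"R"}), "S")]
assert horn_closure({"P", "Q"}, rules) == {"P", "Q", "R", "S"}
```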


Figure 1: Neural program decomposition for scene recognition.
Performance comparisons for sum₈, sum₁₂, and sum₁₆ with different sample counts k.
Performance comparison for sum₈ with different sample counts k.
Performance comparison for sum₁₂ with different sample counts k.
Performance comparison for sum₁₆ with different sample counts k.
Data-Efficient Learning with Neural Programs

June 2024

·

26 Reads

Many computational tasks can be naturally expressed as a composition of a DNN followed by a program written in a traditional programming language or an API call to an LLM. We call such composites "neural programs" and focus on the problem of learning the DNN parameters when the training data consist of end-to-end input-output labels for the composite. When the program is written in a differentiable logic programming language, techniques from neurosymbolic learning are applicable, but in general, learning neural programs requires estimating the gradients of black-box components. We present an algorithm for learning neural programs, called ISED, that relies only on input-output samples of black-box components. For evaluation, we introduce new benchmarks that involve calls to modern LLMs such as GPT-4 and also consider benchmarks from the neurosymbolic learning literature. Our evaluation shows that for the latter benchmarks, ISED has comparable performance to state-of-the-art neurosymbolic frameworks. For the former, we use adaptations of prior work on gradient approximations of black-box components as a baseline, and show that ISED achieves comparable accuracy but in a more data- and sample-efficient manner.
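To see the flavor of learning through a black-box component, here is a REINFORCE-style surrogate loss in PyTorch. It is a sketch of the general idea of sampling input-output pairs of the black box, not the exact ISED estimator; blackbox_loss and its parameters are illustrative names.

```python
import torch

def blackbox_loss(probs, program, y_true, k=16):
    """REINFORCE-style surrogate for learning through a black box:
    sample symbol assignments from the DNN's output distributions, run
    the program on each sample, and reward the joint log-probability of
    samples whose output matches the end-to-end label."""
    dist = torch.distributions.Categorical(probs=probs)
    samples = dist.sample((k,))                 # shape (k, num_inputs)
    rewards = torch.tensor(
        [float(program(s.tolist()) == y_true) for s in samples])
    logp = dist.log_prob(samples).sum(dim=-1)   # joint log-prob per sample
    return -(rewards * logp).mean()

# Toy usage: two digit classifiers feeding a black-box addition program.
probs = torch.full((2, 10), 0.1, requires_grad=True)
loss = blackbox_loss(probs, program=lambda ds: ds[0] + ds[1], y_true=7)
loss.backward()
```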


Relational Programming with Foundation Models

March 2024

·

5 Reads

·

1 Citation

Proceedings of the AAAI Conference on Artificial Intelligence

Foundation models have vast potential to enable diverse AI applications. The powerful yet incomplete nature of these models has spurred a wide range of mechanisms to augment them with capabilities such as in-context learning, information retrieval, and code interpreting. We propose Vieira, a declarative framework that unifies these mechanisms in a general solution for programming with foundation models. Vieira follows a probabilistic relational paradigm and treats foundation models as stateless functions with relational inputs and outputs. It supports neuro-symbolic applications by enabling the seamless combination of such models with logic programs, as well as complex, multi-modal applications by streamlining the composition of diverse sub-models. We implement Vieira by extending the Scallop compiler with a foreign interface that supports foundation models as plugins. We implement plugins for 12 foundation models including GPT, CLIP, and SAM. We evaluate Vieira on 9 challenging tasks that span language, vision, and structured and vector databases. Our evaluation shows that programs in Vieira are concise, can incorporate modern foundation models, and have comparable or better accuracy than competitive baselines.
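The foreign interface can be pictured as a plugin registry mapping model names to stateless functions. The decorator below is hypothetical and only illustrates the shape of such an interface; it is not the actual Scallop extension.

```python
from typing import Callable

REGISTRY: dict[str, Callable] = {}

def register_plugin(name: str):
    """Register a foundation model under a name so a compiler can treat
    it as a stateless function over relational values."""
    def wrap(fn: Callable) -> Callable:
        REGISTRY[name] = fn
        return fn
    return wrap

@register_plugin("gpt")
def gpt_complete(prompt: str) -> list[tuple[float, str]]:
    # Stand-in for a real GPT call; returns one certain completion.
    return [(1.0, "completion for: " + prompt)]
```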



Relational Query Synthesis ⋈ Decision Tree Learning

December 2023

·

24 Reads

Proceedings of the VLDB Endowment

We study the problem of synthesizing a core fragment of relational queries called select-project-join (SPJ) queries from input-output examples. Search-based synthesis techniques are suited to synthesizing projections and joins by navigating the network of relational tables but require additional supervision for synthesizing comparison predicates. On the other hand, decision tree learning techniques are suited to synthesizing comparison predicates when the input database can be summarized as a single labelled relational table. In this paper, we adapt and interleave methods from the domains of relational query synthesis and decision tree learning, and present an end-to-end framework for synthesizing relational queries with categorical and numerical comparison predicates. Our technique guarantees the completeness of the synthesis procedure and strongly encourages minimality of the synthesized program. We present Libra, an implementation of this technique and evaluate it on a benchmark suite of 1,475 instances of queries over 159 databases with multiple tables. Libra solves 1,361 of these instances in an average of 59 seconds per instance. It outperforms state-of-the-art program synthesis tools Scythe and PatSQL in terms of both the running time and the quality of the synthesized programs.
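The interleaving idea can be pictured with a toy example: once search-based synthesis fixes a join, the joined rows are labeled by the input-output examples, and an off-the-shelf decision tree proposes the comparison predicate. The sketch below uses scikit-learn as the tree learner and hypothetical data; it illustrates the division of labor, not Libra's actual implementation.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Joined rows (age, dept_id) labeled by whether they appear in the
# desired query output; the tree's split becomes a WHERE predicate.
rows = np.array([[25, 1], [47, 0], [31, 1], [52, 0]])
keep = np.array([1, 0, 1, 0])
tree = DecisionTreeClassifier(max_depth=1).fit(rows, keep)
feature = tree.tree_.feature[0]      # column index of the root split
threshold = tree.tree_.threshold[0]  # numeric comparison constant
print(f"learned predicate: col{feature} <= {threshold:.1f}")
```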


Mobius: Synthesizing Relational Queries with Recursive and Invented Predicates

October 2023

·

9 Reads

·

6 Citations

Proceedings of the ACM on Programming Languages

Synthesizing relational queries from data is challenging in the presence of recursion and invented predicates. We propose a fully automated approach to synthesize such queries. Our approach comprises two steps: it first synthesizes a non-recursive query consistent with the given data, and then identifies recursion schemes in it and thereby generalizes to arbitrary data. This generalization is achieved by an iterative predicate unification procedure which exploits the notion of data provenance to accelerate convergence. In each iteration of the procedure, a constraint solver proposes a candidate query, and a query evaluator checks if the proposed program is consistent with the given data. The data provenance for a failed query allows us to construct additional constraints for the constraint solver and refine the search. We have implemented our approach in a tool named Mobius. On a suite of 21 challenging recursive query synthesis tasks, Mobius outperforms three state-of-the-art baselines, Gensynth, ILASP, and Popper, in terms of both runtime and accuracy. We also demonstrate that the synthesized queries generalize well to unseen data.
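The propose-and-check loop at the heart of the approach can be sketched in a few lines of Python. Here the "solver" is a plain iterator over candidates and the pruning is a simple blocklist; in Mobius, the provenance of a failed query instead yields new constraints for the constraint solver.

```python
def synthesize(candidates, examples):
    """Propose a candidate query, check it against the data, and prune
    on failure; returns the first candidate consistent with all
    (input, output) examples, or None."""
    blocked = set()
    for query in candidates:
        if query in blocked:
            continue
        if all(query(inp) == out for inp, out in examples):
            return query
        blocked.add(query)  # Mobius would instead derive constraints
                            # from the failed query's provenance
    return None

examples = [([3, 1, 2], [1, 2, 3])]
cands = [lambda xs: list(reversed(xs)), lambda xs: sorted(xs)]
best = synthesize(cands, examples)   # finds the sorting candidate
```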


Citations (64)


... Large Language Models (LLMs) have been widely adopted for cybersecurity threat analysis and deployment of mitigations in enterprise systems [4]-[14]. They have been particularly effective in cyber threat intelligence analysis [7], [8], LLM-assisted attacks [6], [9], [10], log-based anomaly detection [11], [14], and vulnerability detection [4], [5], [12], [13]. ...

Reference:

LLM Embedding-based Attribution (LEA): Quantifying Source Contributions to Generative Model's Response for Vulnerability Analysis
Understanding the Effectiveness of Large Language Models in Detecting Security Vulnerabilities
  • Citing Conference Paper
  • March 2025

... Unfortunately, they require users to program in imperative languages such as Python or TypeScript. At the other end of the spectrum, frameworks such as DSPy [15] and Vieira [19] avoid hand-written prompts altogether by automatically generating them. Unfortunately, this takes away even more control from the developer. ...

Relational Programming with Foundation Models
  • Citing Article
  • March 2024

Proceedings of the AAAI Conference on Artificial Intelligence

... Finally, work on the quadratic LMI scalability issue is needed. One could exploit sparsity using chordal decomposition [46], or use block-diagonal factorization so that each layer is verified with a much smaller LMI that can be solved in parallel [47]. ...

Chordal sparsity for SDP-based neural network verification
  • Citing Article
  • March 2024

Automatica

... In program synthesis, a candidate program that fails to meet the specification can be viewed as a conflict. Various algorithms [23,44,61,70] have been proposed to generalize such a conflict to unseen programs that would fail for the same reason. Our work is distinct in a few ways: we perform conflict-driven search over under-approximations, our conflicts are generalized to new ones in a way that exploits a predefined lattice structure of UAs, and we aim to generate inputs for SQL queries that meet a given property. ...

Mobius: Synthesizing Relational Queries with Recursive and Invented Predicates
  • Citing Article
  • October 2023

Proceedings of the ACM on Programming Languages

... Our approach falls into the category of provably safe RL (PSRL) techniques [7,16], treating safety as a hard constraint that must never be violated. This is in contrast to statistically safe RL techniques, which provide only statistical bounds on the system's safety by constraining the training objectives [3,19,20,21,22]. These soft guarantees, however, are insufficient for domains like autonomous driving, where each failure can be catastrophic. ...

Policy Synthesis and Reinforcement Learning for Discounted LTL

Lecture Notes in Computer Science

... A series-parallel graph can be constructed recursively: a single edge is a series-parallel graph, and smaller series-parallel graphs can be composed either in series or in parallel. Although this class was introduced long ago [16], it still attracts the attention of researchers (see, e.g., [2,3,10,15,27]). Series-parallel graphs are a well-known and well-studied graph class from a theoretical perspective and naturally model two-terminal networks constructed with the series and parallel compositions. ...
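That recursive definition translates directly into code; here is a minimal Python sketch with illustrative names:

```python
class SP:
    """Series-parallel graphs via the textbook recursion: a single edge
    is series-parallel, and so is the series or parallel composition of
    two series-parallel graphs."""
    def __init__(self, kind, left=None, right=None):
        self.kind, self.left, self.right = kind, left, right

edge = lambda: SP("edge")
series = lambda a, b: SP("series", a, b)
parallel = lambda a, b: SP("parallel", a, b)

def edges(g):
    return 1 if g.kind == "edge" else edges(g.left) + edges(g.right)

# Two edges in series, placed in parallel with a third edge.
g = parallel(series(edge(), edge()), edge())
assert edges(g) == 3
```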

A Robust Theory of Series Parallel Graphs
  • Citing Article
  • January 2023

Proceedings of the ACM on Programming Languages

... In [16], the authors address several problems that arise when deploying microservice-based architectures on serverless platforms, including managing state, supporting concurrency control, and achieving deterministic execution. This article introduces the µ2sls system. ...

Executing Microservice Applications on Serverless, Correctly
  • Citing Article
  • January 2023

Proceedings of the ACM on Programming Languages

... Existing hard stability certification methods rely on a classifier's Lipschitz constant, which measures its sensitivity to input perturbations. While this quantity is useful for robustness analysis [13], it is often intractable to compute [54] and difficult to approximate [18,57]. To address this, Xue et al. [58] proposed ...

Chordal Sparsity for Lipschitz Constant Estimation of Deep Neural Networks
  • Citing Conference Paper
  • December 2022

... Prior work has sought to formulate an RL agent's mission as a logical formula in some temporal logic, typically Linear-time Temporal Logic (LTL) [11] or, much less frequently, Probabilistic Computation Tree Logic (PCTL) [18]. That is, in those works, there is no reward, only a logical formula that encodes the mission, and algorithms are given for computing a policy that produces formula-satisfying behavior(s) [5]. LTL over MDPs is used in [16] to reason about normative conflicts; we adopt PCTL as a better-suited formalism for stochastic systems, and are not (yet) concerned with normative conflicts. ...

A Framework for Transforming Specifications in Reinforcement Learning
  • Citing Chapter
  • December 2022

Lecture Notes in Computer Science