Aws Albarghouthi’s research while affiliated with University of Wisconsin–Madison and other places


Publications (92)


COSMOS: Predictable and Cost-Effective Adaptation of LLMs
  • Preprint

April 2025

Jiayu Wang · Aws Albarghouthi

Large language models (LLMs) achieve remarkable performance across numerous tasks by using a diverse array of adaptation strategies. However, optimally selecting a model and adaptation strategy under resource constraints is challenging and often requires extensive experimentation. We investigate whether it is possible to accurately predict both performance and cost without expensive trials. We formalize the strategy selection problem for LLMs and introduce COSMOS, a unified prediction framework that efficiently estimates adaptation outcomes at minimal cost. We instantiate and study the capability of our framework via a pair of powerful predictors: embedding-augmented lightweight proxy models to predict fine-tuning performance, and low-sample scaling laws to forecast retrieval-augmented in-context learning. Extensive evaluation across eight representative benchmarks demonstrates that COSMOS achieves high prediction accuracy while reducing computational costs by 92.72% on average, and up to 98.71% in resource-intensive scenarios. Our results show that efficient prediction of adaptation outcomes is not only feasible but can substantially reduce the computational overhead of LLM deployment while maintaining performance standards.
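The low-sample scaling-law idea can be illustrated with a toy sketch: fit a two-parameter power law err ≈ a·n^(−b) to errors measured at a few small sample counts, then extrapolate to a larger budget. The data points and parameters below are hypothetical, not taken from the paper.

```python
import math

def fit_power_law(ns, errs):
    """Fit err ~ a * n**(-b) by least squares in log-log space."""
    xs = [math.log(n) for n in ns]
    ys = [math.log(e) for e in errs]
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    a = math.exp(my - slope * mx)
    return a, -slope  # err = a * n**(-b)

# Hypothetical errors observed at a few small sample counts.
ns = [8, 16, 32, 64]
errs = [0.40, 0.283, 0.20, 0.141]

a, b = fit_power_law(ns, errs)
pred_512 = a * 512 ** (-b)  # extrapolated error at a much larger budget
```

The cheap part is that only the small-n runs are ever executed; the large-n behavior is predicted from the fitted curve.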



Checking Observational Correctness of Database Systems

April 2025

Proceedings of the ACM on Programming Languages

Lauren Pick · Amanda Xu · Ankush Desai · [...] · Aws Albarghouthi

Clients rely on database systems to be correct, which requires the system not only to implement transactions’ semantics correctly but also to provide isolation guarantees for the transactions. This paper presents a client-centric technique for checking both semantic correctness and isolation-level guarantees for black-box database systems based on observations collected from running transactions on these systems. Our technique verifies observational correctness with respect to a given set of transactions and observations for them, which holds iff there exists a possible correct execution of the transactions under a given isolation level that could result in these observations. Our technique relies on novel symbolic encodings of (1) the semantic correctness of database transactions in the presence of weak isolation and (2) isolation-level guarantees. These are used by the checker to query a Satisfiability Modulo Theories solver. We applied our tool Troubadour to verify observational correctness of several database systems, including PostgreSQL and an industrial system under development, in which the tool helped detect two new bugs. We also demonstrate that Troubadour is able to find known semantic correctness bugs and detect isolation-related anomalies.
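As a toy illustration of observational correctness (not the paper's symbolic encoding, which scales far beyond this), one can brute-force whether some serial order of the observed transactions replays every read correctly:

```python
from itertools import permutations

# Toy model: a transaction is a list of ops, ("w", key, value) or
# ("r", key, observed_value). Observational correctness (here, against
# serializability) holds iff some serial order replays all transactions
# such that every read returns the latest write (None if never written).
def observationally_correct(txns, initial=None):
    keys = {op[1] for t in txns for op in t}
    for order in permutations(range(len(txns))):
        store = {k: initial for k in keys}
        ok = True
        for i in order:
            for op in txns[i]:
                if op[0] == "w":
                    store[op[1]] = op[2]
                elif store[op[1]] != op[2]:
                    ok = False
                    break
            if not ok:
                break
        if ok:
            return True
    return False

t1 = [("w", "x", 1)]
t2 = [("r", "x", 1), ("w", "x", 2)]
t3 = [("r", "x", 2)]
explained = observationally_correct([t1, t2, t3])            # t1 < t2 < t3 works
impossible = observationally_correct([t1, [("r", "x", 7)]])  # no order yields x == 7
```

Troubadour instead encodes the existence of such an execution, under a given isolation level, as a query to an SMT solver rather than enumerating orders.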


Dependency-Aware Compilation for Surface Code Quantum Architectures

April 2025

Proceedings of the ACM on Programming Languages

Practical applications of quantum computing depend on fault-tolerant devices with error correction. We study the problem of compiling quantum circuits for quantum computers implementing surface codes. Optimal or near-optimal compilation is critical for both efficiency and correctness. The compilation problem requires (1) mapping circuit qubits to the device qubits and (2) routing execution paths between interacting qubits. We solve this problem efficiently and near-optimally with a novel algorithm that exploits the dependency structure of circuit operations to formulate discrete optimization problems that can be approximated via simulated annealing, a classic and simple algorithm. Our extensive evaluation shows that our approach is powerful and flexible for compiling realistic workloads.
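The simulated-annealing component can be sketched on a simplified version of the mapping subproblem: place circuit qubits on a grid of device qubits so as to minimize the total Manhattan distance between interacting pairs. Everything below (the interaction list, grid size, and cooling schedule) is hypothetical, not the paper's formulation.

```python
import math
import random

def anneal_mapping(n, pairs, grid_w, steps=5000, seed=0):
    """Map n circuit qubits onto a grid_w-wide grid of device qubits,
    minimizing total Manhattan distance between interacting pairs."""
    rng = random.Random(seed)
    pos = lambda q: divmod(q, grid_w)  # device qubit -> (row, col)
    def cost(m):
        return sum(abs(pos(m[a])[0] - pos(m[b])[0]) +
                   abs(pos(m[a])[1] - pos(m[b])[1]) for a, b in pairs)
    mapping = list(range(n))           # circuit qubit i -> device qubit
    best, best_cost = mapping[:], cost(mapping)
    cur_cost, temp = best_cost, 2.0
    for _ in range(steps):
        i, j = rng.sample(range(n), 2)
        mapping[i], mapping[j] = mapping[j], mapping[i]   # propose a swap
        new_cost = cost(mapping)
        if new_cost <= cur_cost or rng.random() < math.exp((cur_cost - new_cost) / temp):
            cur_cost = new_cost
            if new_cost < best_cost:
                best, best_cost = mapping[:], new_cost
        else:
            mapping[i], mapping[j] = mapping[j], mapping[i]  # undo the swap
        temp = max(0.01, temp * 0.999)  # geometric cooling
    return best, best_cost

# Hypothetical ring of interactions among 6 qubits on a 3-wide grid.
pairs = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0)]
mapping, c = anneal_mapping(6, pairs, grid_w=3)
```

The swap-based neighborhood and the acceptance of occasional uphill moves are the two standard ingredients of simulated annealing.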



Verified Foundations for Differential Privacy

December 2024

Differential privacy (DP) has become the gold standard for privacy-preserving data analysis, but implementing it correctly has proven challenging. Prior work has focused on verifying DP at a high level, assuming the foundations are correct and a perfect source of randomness is available. However, the underlying theory of differential privacy can be very complex and subtle. Flaws in basic mechanisms and random number generation have been a critical source of vulnerabilities in real-world DP systems. In this paper, we present SampCert, the first comprehensive, mechanized foundation for differential privacy. SampCert is written in Lean with over 12,000 lines of proof. It offers a generic and extensible notion of DP, a framework for constructing and composing DP mechanisms, and formally verified implementations of Laplace and Gaussian sampling algorithms. SampCert provides (1) a mechanized foundation for developing the next generation of differentially private algorithms, and (2) mechanically verified primitives that can be deployed in production systems. Indeed, SampCert's verified algorithms power the DP offerings of Amazon Web Services (AWS), demonstrating its real-world impact. SampCert's key innovations include: (1) A generic DP foundation that can be instantiated for various DP definitions (e.g., pure, concentrated, Rényi DP); (2) formally verified discrete Laplace and Gaussian sampling algorithms that avoid the pitfalls of floating-point implementations; and (3) a simple probability monad and novel proof techniques that streamline the formalization. To enable proving complex correctness properties of DP and random number generation, SampCert makes heavy use of Lean's extensive Mathlib library, leveraging theorems in Fourier analysis, measure and probability theory, number theory, and topology.
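A floating-point sketch of discrete Laplace sampling, using the classical fact that the difference of two i.i.d. geometric variables is discrete-Laplace distributed. This is illustrative only; SampCert's verified samplers avoid floats entirely, which is precisely the pitfall the paper addresses.

```python
import math
import random

def discrete_laplace(t, rng):
    """Sample Z with P(Z = z) proportional to exp(-|z|/t).
    Uses Z = X - Y for i.i.d. geometric X, Y. Floating-point sketch;
    a verified implementation would use exact integer arithmetic."""
    p = 1.0 - math.exp(-1.0 / t)
    def geometric():                  # failures before first success
        n = 0
        while rng.random() >= p:
            n += 1
        return n
    return geometric() - geometric()

rng = random.Random(0)
samples = [discrete_laplace(2.0, rng) for _ in range(20000)]
mean = sum(samples) / len(samples)    # distribution is symmetric around 0
```

Sampling on the integers sidesteps the floating-point attacks known against naive continuous Laplace implementations, but the `math.exp` call above would itself need to be replaced by exact arithmetic in a production mechanism.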


Optimizing Quantum Circuits, Fast and Slow

November 2024

Optimizing quantum circuits is critical: the number of quantum operations needs to be minimized for a successful evaluation of a circuit on a quantum processor. In this paper we unify two disparate ideas for optimizing quantum circuits: rewrite rules, which are fast, standard optimizer passes, and unitary synthesis, which is slow, requiring a search through the space of circuits. We present a clean, unifying framework for thinking of rewriting and resynthesis as abstract circuit transformations. We then present a radically simple algorithm, GUOQ, for optimizing quantum circuits that exploits the synergies of rewriting and resynthesis. Our extensive evaluation demonstrates the ability of GUOQ to strongly outperform existing optimizers on a wide range of benchmarks.
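A fast rewrite pass of the kind the paper contrasts with resynthesis can be sketched as peephole rules over a gate list. The rule set and circuit encoding below are illustrative, not GUOQ's.

```python
# Toy rule-based pass: cancel adjacent self-inverse gates acting on the
# same qubits, and merge consecutive rz rotations. Gates are
# (name, qubits, angle) triples; angle is 0.0 when unused.
SELF_INVERSE = {"h", "x", "z", "cx"}

def rewrite(circuit):
    changed = True
    while changed:                      # iterate to a fixpoint
        changed, out, i = False, [], 0
        while i < len(circuit):
            if i + 1 < len(circuit):
                (g1, q1, a1), (g2, q2, a2) = circuit[i], circuit[i + 1]
                if g1 == g2 and q1 == q2 and g1 in SELF_INVERSE:
                    i, changed = i + 2, True          # cancel the pair
                    continue
                if g1 == g2 == "rz" and q1 == q2:
                    out.append(("rz", q1, a1 + a2))   # merge rotations
                    i, changed = i + 2, True
                    continue
            out.append(circuit[i])
            i += 1
        circuit = out
    return circuit

circ = [("h", (0,), 0.0), ("h", (0,), 0.0),
        ("rz", (1,), 0.3), ("rz", (1,), 0.4),
        ("cx", (0, 1), 0.0), ("cx", (0, 1), 0.0)]
optimized = rewrite(circ)   # everything cancels except one merged rz
```

Passes like this run in linear time per sweep, which is why they are the "fast" half; resynthesis replaces whole subcircuits by searching for smaller unitaries, the "slow" half.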


[Figure 3: marginal means for each multiplicity resolution technique. A score of 0.5 indicates how often an option would be chosen under purely random selection; a score above (below) 0.5 indicates an option chosen more (less) frequently than chance.]
Perceptions of the Fairness Impacts of Multiplicity in Machine Learning

September 2024

Machine learning (ML) is increasingly used in high-stakes settings, yet multiplicity -- the existence of multiple good models -- means that some predictions are essentially arbitrary. ML researchers and philosophers posit that multiplicity poses a fairness risk, but no studies have investigated whether stakeholders agree. In this work, we conduct a survey to see how the presence of multiplicity impacts lay stakeholders' -- i.e., decision subjects' -- perceptions of ML fairness, and which approaches to address multiplicity they prefer. We investigate how these perceptions are modulated by task characteristics (e.g., stakes and uncertainty). Survey respondents think that multiplicity lowers distributional, but not procedural, fairness, even though existing work suggests the opposite. Participants are strongly against resolving multiplicity by using a single good model (effectively ignoring multiplicity) or by randomizing over possible outcomes. Our results indicate that model developers should be intentional about dealing with multiplicity in order to maintain fairness.


[Figure: results of applying ARC-Tran to a decoder-only Transformer on the SST2 dataset.]
A One-Layer Decoder-Only Transformer is a Two-Layer RNN: With an Application to Certified Robustness

May 2024

This paper reveals a key insight that a one-layer decoder-only Transformer is equivalent to a two-layer Recurrent Neural Network (RNN). Building on this insight, we propose ARC-Tran, a novel approach for verifying the robustness of decoder-only Transformers against arbitrary perturbation spaces. Compared to ARC-Tran, current robustness verification techniques are limited either to specific and length-preserving perturbations like word substitutions or to recursive models like LSTMs. ARC-Tran addresses these limitations by meticulously managing position encoding to prevent mismatches and by utilizing our key insight to achieve precise and scalable verification. Our evaluation shows that ARC-Tran (1) trains models more robust to arbitrary perturbation spaces than those produced by existing techniques and (2) shows high certification accuracy of the resulting models.
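The recurrent flavor of the equivalence can be gestured at with incremental causal attention, where the "state" carried from token to token is the cache of past keys and values. This is only intuition, since the paper constructs an exact two-layer RNN; the dimensions and vectors below are made up.

```python
import math

# One step of single-head causal self-attention, written as a state
# transition: the state is (keys, values) seen so far, updated at every
# token like an RNN hidden state.
def attend_step(state, k, v, q):
    keys, vals = state[0] + [k], state[1] + [v]   # state update
    scores = [math.exp(sum(a * b for a, b in zip(q, key))) for key in keys]
    z = sum(scores)
    out = [sum(s * val[d] for s, val in zip(scores, vals)) / z
           for d in range(len(v))]
    return (keys, vals), out

state = ([], [])
for k, v in [([1, 0], [1, 0]), ([0, 1], [0, 1])]:
    state, _ = attend_step(state, k, v, [0, 0])
state, out = attend_step(state, [1, 1], [2, 2], [0, 0])
# With a zero query, all scores are equal, so the output is the mean of
# the cached values.
```

Unlike a classical RNN, this state grows with the sequence; part of the paper's contribution is making the correspondence precise despite such differences.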



Citations (60)


... We mitigate this by running multiple parallel distillation processes for each magic state. The enhanced locality offers greater parallelism than existing parallel-computing schemes based on the conventional layout [8], [9], [11]. Moreover, our scheme is compatible with many routing and scheduling algorithms developed for conventional layouts, opening the door to even more efficient compilation strategies. ...

Reference:

Locality-aware Pauli-based computation for local magic state preparation
Dependency-Aware Compilation for Surface Code Quantum Architectures
  • Citing Article
  • April 2025

Proceedings of the ACM on Programming Languages

... The sources of multiplicity can be diverse, and multiplicity has been demonstrated across different dimensions, such as random seeds [29], target variables [128], different sparse decision trees [131] and the dataset generation process [91] as well as model design decisions [117]. Multiple measures have been proposed to quantify these effects [29,41,66]. ...

The Dataset Multiplicity Problem: How Unreliable Data Impacts Predictions
  • Citing Conference Paper
  • June 2023

... Optimisation of computation graphs is a long-standing problem in computer science that is seeing renewed interest in the compiler [12], machine learning (ML) [8,5] and quantum computing communities [20,19]. In all of these domains, graphs encode computations that are either expensive to execute or that are evaluated repeatedly over many iterations, making graph optimisation a primary concern. ...

Synthesizing Quantum-Circuit Optimizers
  • Citing Article
  • June 2023

Proceedings of the ACM on Programming Languages

... In addition, the cost of producing and maintaining humanoid robots is likely to decrease over time, due to advances in technology, which in the long term could make them a more sustainable option, also considering the increasing cost of human healthcare services. Some research efforts have even started to explore how non-technical users can personalise the behaviour of such robots through End-User Development approaches (Leonardi et al., 2019), (Porfirio et al., 2023). However, despite being quite a promising interactive technology, so far humanoid robots have received limited adoption for older adult assistance in real-world contexts (Carros et al., 2022) for various reasons, for example, it has been reported (Yuan et al., 2021) that a rehabilitation robot should be better able to adapt and personalise to the specific individual needs. ...

Sketching Robot Programs On the Fly
  • Citing Conference Paper
  • March 2023

... Cleaning tasks are united by the need to apply a tool to a stationary surface, yet each require different tool-use skills. HRI researchers have taken various approaches to collecting the information to support automation, including crowd-sourcing human explanations of task procedures [14,15] and end-user demonstration [16]. ...

Crowdsourcing Task Traces for Service Robotics
  • Citing Conference Paper
  • March 2023

... This is very different from finding adversarial examples, which are inputs slightly perturbed so that to lead to a different output. Finding adversarial examples and verifying robustness have attracted considerable attention in NNs (for example, [18], [19], [20]) as well as in randomforest classifiers (e.g., [21], [22]). ...

Proving Data-Poisoning Robustness in Decision Trees
  • Citing Article
  • January 2023

Communications of the ACM

... Therefore, we explore adaptable backdoor poisoning detection techniques for code watermarks [7,22,50]. Recently, spectral signatures (SS) [50] and activation clustering (AC) [7], initially proposed for eliminating backdoor poisoning attacks in computer vision tasks, have been widely applied to evaluate the effectiveness of backdoors and watermarks in code datasets [36,44]. While AC clusters the representations of the training samples into two partitions to distinguish the backdoor samples, SS computes an outlier score for each representation. ...

Backdoors in Neural Models of Source Code
  • Citing Conference Paper
  • August 2022

... Typically, challenges of the scheduling problem encompass two principal components: gate scheduling and initial qubit mapping. Prior research focusing on superconducting devices has underscored the profound impact of initial qubit allocation on the final quality of the circuit output [40,44,69,90]. This indicates the necessity of devising a high-quality initial mapping to optimize circuit performance. ...

Qubit Mapping and Routing via MaxSAT
  • Citing Conference Paper
  • October 2022

... We aim to derive a superset of the range of the network output. For this, we exploit the technique of Interval Bound Propagation (IBP), based on ideas from (Gowal et al., 2018; Wang et al., 2022; Gehr et al., 2018). Propagating the n_0-dimensional box through the network leads to the superset of the range of the network output. ...

Interval universal approximation for neural networks
  • Citing Article
  • January 2022

Proceedings of the ACM on Programming Languages
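The interval bound propagation idea cited in the excerpt above can be sketched for one affine layer followed by a ReLU; the weights and input box below are hypothetical.

```python
# Interval bound propagation: push a box [lo, hi] through an affine
# layer and a ReLU to obtain a superset of the reachable outputs.
def affine_interval(lo, hi, W, b):
    out_lo, out_hi = [], []
    for row, bias in zip(W, b):
        # Lower bound takes lo where the weight is nonnegative, hi otherwise;
        # upper bound does the opposite.
        l = bias + sum(w * (lo[j] if w >= 0 else hi[j]) for j, w in enumerate(row))
        h = bias + sum(w * (hi[j] if w >= 0 else lo[j]) for j, w in enumerate(row))
        out_lo.append(l)
        out_hi.append(h)
    return out_lo, out_hi

def relu_interval(lo, hi):
    return [max(0.0, x) for x in lo], [max(0.0, x) for x in hi]

# Hypothetical 2-input, 2-unit layer with the input box [-1, 1] x [-1, 1].
W, b = [[1.0, -1.0], [0.5, 0.5]], [0.0, -1.0]
lo, hi = affine_interval([-1.0, -1.0], [1.0, 1.0], W, b)
lo, hi = relu_interval(lo, hi)
```

Each layer's bounds are sound but may overapproximate, which is the precision/scalability trade-off IBP makes.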

... Existing work (Shi et al., 2020; Bonaert et al., 2021) for verifying transformer robustness primarily addresses l_p norm perturbations of embeddings, which only captures word substitutions. However, ensuring language model robustness requires addressing a wider range of transformations like insertions, deletions, substitutions, swaps, and their combinations (Zhang et al., 2021, 2023). ARC (Zhang et al., 2021) is the only technique that verifies model robustness against arbitrary perturbation spaces, but it only works on recursive models like LSTMs. ...

Certified Robustness to Programmable Transformations in LSTMs
  • Citing Conference Paper
  • January 2021