Netanel Raviv’s research while affiliated with Washington University in St. Louis and other places


Publications (109)


Access-Redundancy Tradeoffs in Quantized Linear Computations
  • Article

November 2024

·

7 Reads

·

1 Citation

IEEE Transactions on Information Theory

·

Netanel Raviv

·

Linear real-valued computations over distributed datasets are common in many applications, most notably as part of machine learning inference. In particular, linear computations that are quantized, i.e., where the coefficients are restricted to a predetermined set of values (such as ±1), have gained increasing interest lately due to their role in efficient, robust, or private machine learning models. Given a dataset to store in a distributed system, we wish to encode it so that all such computations could be conducted by accessing a small number of servers, called the access parameter of the system. Doing so frees the remaining servers to execute other tasks. Minimizing the access parameter gives rise to an access-redundancy tradeoff, where a smaller access parameter requires more redundancy in the system, and vice versa. In this paper, we study this tradeoff and provide several explicit low-access schemes for {±1} quantized linear computations based on covering codes in a novel way. While the connection to covering codes has been observed in the past, our results strictly outperform the state-of-the-art for two-valued linear computations. We further show that the same storage scheme can be used to retrieve any linear combination with two distinct coefficients, regardless of what those coefficients are, with the same access parameter. This universality result is then extended to all possible quantizations with any number of values; while the storage remains identical, the access parameter increases according to a new additive-combinatorics property we call coefficient complexity. We then turn to study the coefficient complexity: we characterize the complexity of small sets of coefficients, provide bounds, and identify coefficient sets having the highest and lowest complexity.
Interestingly, arithmetic progressions have the lowest possible complexity, and some geometric progressions have the highest possible complexity, the former being particularly attractive for its common use in uniform quantization.
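
As a rough illustration of why a {±1} combination relates to any two-valued combination, the sketch below uses an elementary algebraic identity: if c_i equals a where s_i = +1 and b where s_i = -1, then c_i = (a+b)/2 + ((a-b)/2)·s_i. The function name and the sample values are hypothetical, and this is not claimed to be the paper's scheme; in particular it says nothing about how many servers must be accessed.

```python
def two_valued_from_sign(x, signs, a, b):
    """Evaluate sum_i c_i * x_i, where c_i = a if signs[i] == +1 and c_i = b
    if signs[i] == -1, using only the plain sum of x and the ±1 combination.

    Illustrative identity only (hypothetical helper, not the paper's method).
    """
    total = sum(x)                                    # sum_i x_i
    signed = sum(s * xi for s, xi in zip(signs, x))   # sum_i s_i * x_i
    return (a + b) / 2 * total + (a - b) / 2 * signed


def direct(x, signs, a, b):
    """Reference computation with the two-valued coefficients spelled out."""
    return sum((a if s == 1 else b) * xi for s, xi in zip(signs, x))


x = [1.0, 2.0, 3.0]
signs = [1, -1, 1]
print(two_valued_from_sign(x, signs, a=3, b=-1), direct(x, signs, a=3, b=-1))
```

Both calls evaluate the same combination, so the two printed values agree.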


Gram-Schmidt Methods for Unsupervised Feature Extraction and Selection
  • Preprint
  • File available

October 2024

·

26 Reads



ε-MSR Codes for Any Set of Helper Nodes

August 2024

·

14 Reads

Minimum storage regenerating (MSR) codes are a class of maximum distance separable (MDS) array codes capable of repairing any single failed node by downloading the minimum amount of information from each of the helper nodes. However, MSR codes require large sub-packetization levels, which hinders their usefulness in practical settings. This led to the development of another class of MDS array codes called ε-MSR codes, for which the repair information downloaded from each helper node is at most a factor of (1+ε) from the minimum amount for some ε > 0. The advantage of ε-MSR codes over MSR codes is their small sub-packetization levels. In previous constructions of ε-MSR codes, however, several specific nodes are required to participate in the repair of a failed node, which limits the performance of the code in cases where these nodes are not available. In this work, we present a construction of ε-MSR codes without this restriction. For a code with n nodes, out of which k store uncoded information, and for any number d of helper nodes (k ≤ d < n), the repair of a failed node can be done by contacting any set of d surviving nodes. Our construction utilizes group algebra techniques, and requires linear field size. We also generalize the construction to MDS array codes capable of repairing h failed nodes using d helper nodes with a slightly sub-optimal download from each helper node, for all h ≤ r and k ≤ d ≤ n−h simultaneously.
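
For orientation, the "(1+ε) factor" can be written out using the standard cut-set bound for MSR codes. The symbol ℓ for the sub-packetization level (the number of symbols stored per node) and the symbol β for the per-helper download are notation assumed here, not taken from the abstract.

```latex
% Per-helper download when repairing one failed node with d helpers
% (\ell = sub-packetization level; notation assumed for this sketch).
\beta_{\mathrm{MSR}} = \frac{\ell}{d-k+1},
\qquad
\beta_{\varepsilon\text{-MSR}} \le (1+\varepsilon)\,\frac{\ell}{d-k+1},
\qquad k \le d < n .
```

The total repair download is d times the per-helper amount, so an ε-MSR code accepts at most a (1+ε) increase in repair traffic in exchange for a smaller ℓ.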


On the Encoding Process in Decentralized Systems

August 2024

·

2 Reads

We consider the problem of encoding information in a system of N=K+R processors that operate in a decentralized manner, i.e., without a central processor which orchestrates the operation. The system involves K source processors, each holding some data modeled as a vector over a finite field. The remaining R processors are sinks, each of which requires a linear combination of all data vectors. These linear combinations are distinct from one sink processor to another, and are specified by a generator matrix of a systematic linear error correcting code. To capture the communication cost of decentralized encoding, we adopt a linear network model in which the process proceeds in consecutive communication rounds. In every round, every processor sends and receives one message through each one of its p ports. Moreover, inspired by linear network coding literature, we allow processors to transfer linear combinations of their own data and previously received data. We propose a framework that addresses the decentralized encoding problem on two levels. On the universal level, we provide a solution to the decentralized encoding problem for any possible linear code. On the specific level, we further optimize our solution towards systematic Reed-Solomon codes, as well as their variant, Lagrange codes, for their prevalent use in coded storage and computation systems. Our solutions are based on a newly-defined collective communication operation we call all-to-all encode.
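
To make the goal of the encoding concrete, the sketch below computes what each sink must end up holding: a linear combination of the K source vectors, with coefficients taken from the non-systematic part A of a generator matrix [I | A]. The prime 257, the matrix A, the data, and the function name are illustrative assumptions. The sketch shows only the target values, not the round-based communication or the all-to-all encode operation.

```python
P = 257  # small prime; all arithmetic is over GF(P) (illustrative choice)


def sink_targets(data, A, p=P):
    """Return the vector each sink requires.

    data: K source vectors (lists of ints mod p), one per source processor.
    A:    K x R coefficient matrix; column r holds the coefficients for sink r.
    Sink r requires sum_k A[k][r] * data[k], computed entrywise mod p.
    (Hypothetical helper for illustration only.)
    """
    K, R = len(A), len(A[0])
    length = len(data[0])
    return [
        [sum(A[k][r] * data[k][j] for k in range(K)) % p for j in range(length)]
        for r in range(R)
    ]


# K = 3 sources, R = 2 sinks, vectors of length 2.
data = [[1, 2], [3, 4], [5, 6]]
A = [[1, 1],
     [1, 2],
     [1, 4]]
print(sink_targets(data, A))
```

A centralized encoder would compute these values in one place; the decentralized setting asks how the processors can reach the same result by exchanging messages over a limited number of ports per round.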


Figure 1: (a) shows five tests with fixed δ_g = δ_i = 0.02: ϵ_g^1 = 0.25 with ϵ_i^1 = 0.05, ϵ_g^2 = 0.25 with ϵ_i^2 = 0.1, ϵ_g^3 = 1.5 with ϵ_i^3 = 0.3, ϵ_g^4 = 3 with ϵ_i^4 = 0.6, and ϵ_g^5 = 5 with ϵ_i^5 = 1, for i ∈ {2, 3, 4, 5}. (b) shows six tests with fixed δ_g = δ_i = 0.02: ϵ_g^1 = 0.4 with ϵ_i^1 = 0.05, ϵ_g^2 = 0.6 with ϵ_i^2 = 0.1, ϵ_g^3 = 1 with ϵ_i^3 = 0.18, ϵ_g^4 = 2 with ϵ_i^4 = 0.3, ϵ_g^5 = 4 with ϵ_i^5 = 0.6, and ϵ_g^5 = 6 with ϵ_i^5 = 1, for i ∈ {1, 2, 3, 4, 5}.
Confounding Privacy and Inverse Composition

August 2024

·

15 Reads

We introduce a novel privacy notion of (ϵ, δ)-confounding privacy that generalizes both differential privacy and Pufferfish privacy. In differential privacy, sensitive information is contained in the dataset while in Pufferfish privacy, sensitive information determines data distribution. Consequently, both assume a chain-rule relationship between the sensitive information and the output of privacy mechanisms. Confounding privacy, in contrast, considers general causal relationships between the dataset and sensitive information. One of the key properties of differential privacy is that it can be easily composed over multiple interactions with the mechanism that maps private data to publicly shared information. In contrast, we show that the quantification of the privacy loss under the composition of independent (ϵ, δ)-confounding private mechanisms using the optimal composition of differential privacy underestimates true privacy loss. To address this, we characterize an inverse composition framework to tightly implement a target global (ϵ_g, δ_g)-confounding privacy under composition while keeping individual mechanisms independent and private. In particular, we propose a novel copula-perturbation method which ensures that (1) each individual mechanism i satisfies a target local (ϵ_i, δ_i)-confounding privacy and (2) the target global (ϵ_g, δ_g)-confounding privacy is tightly implemented by solving an optimization problem. Finally, we study inverse composition empirically on real datasets.






Citations (47)


... Moreover, when retrieving the data via sequencing, molecules are read in a random order, and many fragments are lost [8]. It can also be motivated by applications in fingerprinting and forensics, where one may wish to encode a serial number into a physical object (such as a weapon), which should be recoverable even from a small set of pieces left over from the original object [9,10]. ...

Reference:

Recovering a Message from an Incomplete Set of Noisy Fragments
Break-Resilient Codes for Forensic 3D Fingerprinting
  • Citing Conference Paper
  • July 2024

... This alters the internal structure of DNNs by adding redundant neurons and edges to increase reliability; a new middle layer is added. The authors of [17] proposed an approach that is complementary to other forms of defense and replaces the weights of individual neurons with robust analogs derived from the use of Fourier analytic tools. Additionally, the authors of [18] propose a new method called robustness-aware filter pruning (RFP) and utilize this filter pruning method to increase the robustness against adversarial attacks. ...

Enhancing Robustness of Neural Networks through Fourier Stabilization

... See [5] and [18] for a comprehensive survey of molecular communication systems and the role of the permutation channel in diffusion-based communication systems. An overview of coding challenges for DNA-based storage is presented in [25], while [24] presents an optimal code construction for correcting multiple errors in unordered string-based data encoding within DNA storage systems. See [23] for a comprehensive study of DNA-based storage systems. ...

Error Correction for DNA Storage
  • Citing Article
  • September 2023

IEEE BITS the Information Theory Magazine

... Furthermore, as blockchains encompass more nodes, alleviating load pressure has emerged as an essential priority. Without such measures, the network can quickly become overwhelmed [33,34]. Some studies have integrated the Merkle tree with other technical frameworks to enhance data storage efficiency and diminish system pressure. ...

Transaction Confirmation in Coded Blockchain
  • Citing Preprint
  • May 2023

... An interesting research direction has emerged at the crossroads of coded computation and federated learning, as evidenced by recent contributions such as those in [41], [42]. Moreover, the use case of coded computation in blockchains is studied in [43], [44]. These works focus on addressing challenges such as straggler mitigation and ensuring data privacy within federated settings. ...

Breaking Blockchain’s Communication Barrier with Coded Computation
  • Citing Conference Paper
  • November 2022

... In this paper we focus on linear computations over R, whose coefficients are quantized to a finite set of values (e.g., {±1}). Such computations have gained increasing attention of late, mostly for applications relating to machine learning inference, in which they have proven beneficial in terms of robustness [7,11,13] and privacy [12]. ...

Information Theoretic Private Inference in Quantized Models
  • Citing Conference Paper
  • June 2022

... However, perfect privacy is not helpful for data sharing, since the encoded data cannot be used for training a classifier, which is the authorized use. Recently, in [38], [39], the notions of perfect sample privacy and perfect subset privacy were considered, which are shown to be attainable using instance encoding. These papers demonstrated the possibility of ensuring zero mutual information between the encoded samples and any subset of original samples with a constrained cardinality, while preserving the learnability of the encoded dataset. ...

Perfect Subset Privacy for Data Sharing and Learning
  • Citing Conference Paper
  • June 2022

... Clearly, such specific algorithms are important only if they outperform universal ones, since by definition, every universal algorithm subsumes a specific algorithm for all systematic linear codes. In this paper, we are particularly interested in Reed-Solomon codes and Lagrange codes [2], [9], [12], for their prevalent use in distributed storage and computation systems. ...

Breaking Blockchain’s Communication Barrier With Coded Computation
  • Citing Article
  • June 2022

IEEE Journal on Selected Areas in Information Theory