Moshe Schwartz's research while affiliated with Ben-Gurion University of the Negev and other places
What is this page?
This page lists the scientific contributions of an author, who either does not have a ResearchGate profile, or has not yet added these contributions to their profile.
It was automatically created by ResearchGate to create a record of this author's body of work. We create such pages to advance our goal of creating and maintaining the most comprehensive scientific repository possible. In doing so, we process publicly available (personal) data relating to the author as a member of the scientific community.
If you're a ResearchGate member, you can follow this page to keep up with this author's work.
If you are this author, and you don't want us to display this page anymore, please let us know.
Publications (147)
Motivated by applications to DNA-storage, flash memory, and magnetic recording, we study perfect burst-correcting codes for the limited-magnitude error channel. These codes are lattices that tile the integer grid with the appropriate error ball. We construct two classes of such perfect codes correcting a single burst of length $2$ for $(1,0)$-limit...
We prove a new lower bound on the field size of locally repairable codes (LRCs). Additionally, we construct maximally recoverable (MR) codes which are cyclic. While a known construction for MR codes has the same parameters, it produces non-cyclic codes. Furthermore, we prove necessary and sufficient conditions that specify when the known non-cyclic...
Motivated by DNA storage in living organisms, and by known biological mutation processes, we study the reverse-complement string-duplication system. We fully classify the conditions under which the system has full expressiveness, for all alphabets and all fixed duplication lengths. We then focus on binary systems with duplication length $2$ and pro...
Error-correcting codes over sets, with applications to DNA storage, are studied. The DNA-storage channel receives a set of sequences, and produces a corrupted version of the set, including sequence loss, symbol substitution, symbol insertion/deletion, and limited-magnitude errors in symbols. Various parameter regimes are studied. New bounds on code...
We study whether an asymmetric limited-magnitude ball may tile $\mathbb{Z}^n$. This ball generalizes previously studied shapes: crosses, semi-crosses, and quasi-crosses. Such tilings act as perfect error-correcting codes in a channel which changes a transmitted integer vector in a bounded number of entries by limited-magnitude errors. A construction of lattice...
Motivated by an application to database linear querying, such as private information-retrieval protocols, we suggest a fundamental property of linear codes – the generalized covering radius. The generalized covering-radius hierarchy of a linear code characterizes the trade-off between storage amount, latency, and access complexity, in such database...
Motivated by applications to DNA storage, we study reconstruction and list-reconstruction schemes for integer vectors that suffer from limited-magnitude errors. We characterize the asymptotic size of the intersection of error balls in relation to the code's minimum distance. We also devise efficient reconstruction algorithms for various limited-mag...
We study generalized covering radii, a fundamental property of linear codes that characterizes the trade-off between storage, latency, and access in linear data-query protocols such as PIR. We prove lower and upper bounds on the generalized covering radii of Reed-Muller codes, as well as finding their exact value in certain extreme cases. With the...
We construct integer error-correcting codes and covering codes for the limited-magnitude error channel with more than one error. The codes are lattices that pack or cover the space with the appropriate error ball. Some of the constructions attain an asymptotic packing/covering density that is constant. The results are obtained via various methods,...
We propose a list-decoding scheme for reconstruction codes in the context of uniform-tandem-duplication noise, which can be viewed as an application of the associative memory model to this setting. We find the uncertainty associated with $m>2$ strings (where a previous paper considered $m=2$) in asymptotic terms, where code-words are taken fro...
We study scalar-linear and vector-linear solutions of the generalized combination network. We derive new upper and lower bounds on the maximum number of nodes in the middle layer, depending on the network parameters and the alphabet size. These bounds improve and extend the parameter range of known bounds. Using these new bounds we present a lower...
We study permutations over the set of $\ell$-grams, that are feasible in the sense that there is a sequence whose $\ell$-gram frequency has the same ranking as the permutation. Codes, which are sets of feasible permutations, protect information stored in DNA molecules using the rank-modulation scheme, and read using the shotgun sequencing technique...
Motivated by an application to database linear querying, such as private information-retrieval protocols, we suggest a fundamental property of linear codes -- the generalized covering radius. The generalized covering-radius hierarchy of a linear code characterizes the trade-off between storage amount, latency, and access complexity, in such databas...
Linear codes over finite extension fields have widespread applications in theory and practice. In some scenarios, the decoder has a sequential access to the codeword symbols, giving rise to a hierarchical erasure structure. In this paper we develop a mathematical framework for hierarchical erasures over extension fields, provide several bounds and...
We construct maximally recoverable codes (corresponding to partial MDS codes) which are based on linearized Reed-Solomon codes. The new codes have a smaller field size requirement compared with known constructions. For certain asymptotic regimes, the constructed codes have order-optimal alphabet size, asymptotically matching the known lower bound.
We study the Singleton-type bound that provides an upper limit on the minimum distance of locally repairable codes. We present an improved bound by carefully analyzing the combinatorial structure of the repair sets. Thus, we show the previous bound is unachievable for certain parameters. We then also provide explicit constructions of optimal codes...
Error-correcting codes over sets, with applications to DNA storage, are studied. The DNA-storage channel receives a set of sequences, and produces a corrupted version of the set, including sequence loss, symbol substitution, symbol insertion/deletion, and limited-magnitude errors in symbols. Various parameter regimes are studied. New bounds on code...
Motivated by mutation processes occurring in in-vivo DNA-storage applications, a channel that mutates stored strings by duplicating substrings as well as substituting symbols is studied. Two models of such a channel are considered: one in which the substitutions occur only within the duplicated substrings, and one in which the location of substitut...
We study scalar-linear and vector-linear solutions of the generalized combination network. We derive new upper and lower bounds on the maximum number of nodes in the middle layer, depending on the network parameters and the alphabet size. These bounds improve and extend the parameter range of known bounds. Using these new bounds we present a lower...
We study whether an asymmetric limited-magnitude ball may tile $\mathbb{Z}^n$. This ball generalizes previously studied shapes: crosses, semi-crosses, and quasi-crosses. Such tilings act as perfect error-correcting codes in a channel which changes a transmitted integer vector in a bounded number of entries by limited-magnitude errors. A constructio...
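The link between lattice tilings and perfect codes can be illustrated with a small check. The sketch below is not taken from the paper: it uses the classical radius-1 cross in $\mathbb{Z}^2$ as a stand-in for the asymmetric ball, and the function name `is_lattice_tiling` and the weight vectors are illustrative assumptions. It relies on the standard fact that the lattice $\{x : w \cdot x \equiv 0 \pmod m\}$ tiles with a ball of size $m$ exactly when the ball's elements have pairwise distinct syndromes $w \cdot e \bmod m$.

```python
def is_lattice_tiling(ball, weights, modulus):
    """Check whether the lattice {x : weights . x = 0 (mod modulus)}
    tiles the integer grid with translates of `ball`.

    The ball tiles iff it has exactly `modulus` elements and their
    syndromes (weights . e mod modulus) are pairwise distinct, so that
    every coset of the lattice contains exactly one ball element.
    """
    syndromes = {
        sum(w * e for w, e in zip(weights, err)) % modulus for err in ball
    }
    return len(ball) == modulus and len(syndromes) == modulus


# Radius-1 cross in Z^2: the origin plus one step along each axis direction.
CROSS = [(0, 0), (1, 0), (-1, 0), (0, 1), (0, -1)]

if __name__ == "__main__":
    print(is_lattice_tiling(CROSS, (1, 2), 5))  # syndromes 0, 1, 4, 2, 3
    print(is_lattice_tiling(CROSS, (1, 1), 5))  # syndromes collide
```

With weights `(1, 2)` modulo 5 the five syndromes are all distinct, so the received syndrome identifies the error uniquely, which is what makes the tiling a perfect single-error-correcting code for this ball.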
We construct integer error-correcting codes and covering codes for the limited-magnitude error channel with more than one error. The codes are lattices that pack or cover the space with the appropriate error ball. Some of the constructions attain an asymptotic packing/covering density that is constant. The results are obtained via various methods,...
Minimal multicast networks are fascinating and efficient combinatorial objects, where the removal of a single link makes it impossible for all receivers to obtain all messages. We study the structure of such networks, and prove some constraints on their possible solutions. We then focus on the combination network, which is one of the simplest and m...
A method for encoding information in DNA sequences is described. The method is based on the precision-resolution framework, and is aimed to work in conjunction with a recently suggested terminator-free template independent DNA synthesis method. The suggested method optimizes the amount of information bits per synthesis time unit, namely, the writin...
In this paper, locally repairable codes which have optimal minimum Hamming distance with respect to the bound presented by Prakash et al. are considered. New upper bounds on the length of such optimal codes are derived. The new bounds apply to more general cases, and have weaker requirements compared with the known ones. In this sense, they both im...
A growing number of works have, in recent years, been concerned with in-vivo DNA as medium for data storage. This paper extends the concept of reconstruction codes for uniform-tandem-duplication noise to the model of associative memory, by finding the uncertainty associated with $m>2$ strings (where a previous paper considered $m=2$). That uncertai...
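As a rough illustration of the noise model, the following sketch enumerates the strings reachable by a single tandem duplication of fixed length. It assumes the usual meaning of uniform tandem duplication, namely that a substring of length exactly $k$ is copied and inserted immediately after itself; the function name `tandem_duplications` is hypothetical and does not come from the paper.

```python
def tandem_duplications(s, k):
    """Return the set of strings obtained from `s` by one tandem
    duplication of length `k`: s = u v w becomes u v v w with |v| = k."""
    results = set()
    for start in range(len(s) - k + 1):
        v = s[start:start + k]
        # Insert a second copy of v right after the original occurrence.
        results.add(s[:start + k] + v + s[start + k:])
    return results


if __name__ == "__main__":
    print(sorted(tandem_duplications("abc", 2)))
```

A reconstruction problem in this setting asks how many distinct noisy descendants must be observed before the original string is determined, which depends on how large the intersection of such descendant sets can be.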
We study scalar-linear and vector-linear solutions to the generalized combination network. We derive new upper and lower bounds on the maximum number of nodes in the middle layer, depending on the network parameters. These bounds improve and extend the parameter range of known bounds. Using these new bounds we present a general lower bound on the g...
Optimal locally repairable codes with information locality are considered. Optimal codes are constructed, whose length is also order-optimal with respect to a new bound on the code length derived in this paper. The length of the constructed codes is super-linear in the alphabet size, which improves upon the well known pyramid codes, whose length is...
Motivated by mutation processes occurring in in-vivo DNA-storage applications, a channel that mutates stored strings by duplicating substrings as well as substituting symbols is studied. Two models of such a channel are considered: one in which the substitutions occur only within the duplicated substrings, and one in which the location of substitut...
We study restricted permutations of sets which have a geometrical structure. The study of restricted permutations is motivated by their application in coding for flash memories, and their relevance in different applications of networking technologies and various channels. We generalize the model of $\mathbb{Z}^d$-permutations with restricted moveme...
Genomic evolution can be viewed as string-editing processes driven by mutations. An understanding of the statistical properties resulting from these mutation processes is of value in a variety of tasks related to biological sequence data, e.g., estimation of model parameters and compression. At the same time, due to the complexity of these processe...
Locally repairable codes are desirable for distributed storage systems to improve the repair efficiency. In this paper, a new combination of codes with locality and codes with multiple disjoint repair sets (also called availability) is introduced. Accordingly, a Singleton-type bound is derived for the new code, which contains those bounds in [9], [...
DNA as a data storage medium has several advantages, including far greater data density compared to electronic media. We propose that schemes for data storage in the DNA of living organisms may benefit from studying the reconstruction problem, which is applicable whenever multiple reads of noisy data are available. This strategy is uniquely suited...
We study random string-duplication systems, which we call Pólya string models. These are motivated by a class of mutations that are common in most organisms and lead to an abundance of repeated sequences in their genomes. Unlike previous works that study the combinatorial capacity of string-duplication systems, or in a probabilistic setting, variou...
The generalization of de Bruijn sequences to infinite sequences with respect to the order $n$ has been studied, and it was shown that every de Bruijn sequence of order $n$ in at least three symbols can be extended to a de Bruijn sequence of order $n + 1$. Every de Bruijn sequence of order $n$ in two symbols cannot be extended to order $n + 1$, but...
Background
Tandem repeat sequences are common in the genomes of many organisms and are known to cause important phenomena such as gene silencing and rapid morphological changes. Due to the presence of multiple copies of the same pattern in tandem repeats and their high variability, they contain a wealth of information about the mutations that have...
The combination network is one of the simplest and most insightful networks in coding theory. The vector network coding solutions for this network and some of its sub-networks are examined. For a fixed alphabet size of a vector network coding solution, an upper bound on the number of nodes in the network is obtained. This bound is an MDS bound for subsp...
Synthesis of DNA molecules offers unprecedented advances in storage technology. Yet, the microscopic world in which these molecules reside induces error patterns that are fundamentally different from their digital counterparts. Hence, to maintain reliability in reading and writing, new coding schemes must be developed. In a reading technique called...
Locally repairable codes which are optimal with respect to the bound presented by Prakash et al. are considered. New upper bounds on the length of such optimal codes are derived. The new bounds both improve and generalize previously known bounds. Optimal codes are constructed, whose length is order-optimal when compared with the new upper bounds. T...
Genomic evolution can be viewed as string-editing processes driven by mutations. An understanding of the statistical properties resulting from these mutation processes is of value in a variety of tasks related to biological sequence data, e.g., estimation of model parameters and compression. At the same time, due to the complexity of these processe...
We study array codes which are based on subspaces of a linear space over a finite field, using spreads, $q$-Steiner systems, and subspace transversal designs. We present several constructions of such codes which are $q$-analogs of some known block codes such as the Hamming and simplex codes. We examine the locality and availability of the constructed co...
We study random string-duplication systems, which we call Pólya string models. These are motivated by DNA storage in living organisms, and certain random mutation processes that affect their genome. Unlike previous works that study the combinatorial capacity of string-duplication systems, or various string statistics, this work provides exact cap...
Recent advances in coding for distributed storage systems have reignited the interest in scalar codes over extension fields. In parallel, the rise of large-scale distributed systems has motivated the study of computing in the presence of stragglers, i.e., servers that are slow to respond or unavailable. This paper addresses storage systems that emp...
Synthesis of DNA molecules offers unprecedented advances in storage technology. Yet, the microscopic world in which these molecules reside induces error patterns that are fundamentally different from their digital counterparts. Hence, to maintain reliability in reading and writing, new coding schemes must be developed. In a reading technique called...
Private information retrieval has been reformulated in an information-theoretic perspective in recent years. The two most important parameters considered for a PIR scheme in a distributed storage system are the storage overhead and PIR rate. We take into consideration a third parameter, the access complexity of a PIR scheme, which characterizes the...
Recent advances in coding for distributed storage systems have reignited the interest in scalar codes over extension fields. In parallel, the rise of large-scale distributed systems has motivated the study of computing in the presence of stragglers, i.e., servers that are slow to respond or unavailable. This paper addresses storage systems that emp...
DNA as a data storage medium has several advantages, including far greater data density compared to electronic media. We propose that schemes for data storage in the DNA of living organisms may benefit from studying the reconstruction problem, which is applicable whenever multiple reads of noisy data are available. This strategy is uniquely suited...
We consider the communication of information in the presence of synchronization errors. Specifically, we consider permutation channels in which a transmitted codeword x = (x_1, ... , x...
We find a new formula for the limit of the capacity of certain sequences of multidimensional semiconstrained systems as the dimension tends to infinity. We do so by generalizing the notion of independence entropy, originally studied in the context of constrained systems, to the study of semiconstrained systems. Using the independence entropy, we ob...
Synthesis of DNA molecules offers unprecedented advances in storage technology. Yet, the microscopic world in which these molecules reside induces error patterns that are fundamentally different from their digital counterparts. Hence, to maintain reliability in reading and writing, new coding schemes must be developed. In a reading technique called...
We construct Gray codes over permutations for the rank-modulation scheme, which are also capable of correcting errors under the infinity-metric. These errors model limited-magnitude or spike errors, for which only single-error-detecting Gray codes are currently known. Surprisingly, the error-correcting codes we construct achieve better asymptotic r...
A shift rule for the prefer-max De Bruijn sequence is formulated, for all sequence orders, and over any finite alphabet. An efficient algorithm for this shift rule is presented, which has linear (in the sequence order) time and memory complexity.
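For orientation, the sequence itself can be produced by the classical greedy prefer-max construction, sketched below. This is not the shift rule or the linear-time algorithm from the paper: the greedy method keeps a table of every window seen so far, so its memory grows exponentially in the order, which is the cost a shift rule avoids. The function name `prefer_max_sequence` and the symbol set `0..k-1` are assumptions made for the example.

```python
def prefer_max_sequence(k, n):
    """Greedy prefer-max construction of a de Bruijn sequence of order n
    over the alphabet {0, ..., k-1}.

    Start from n zeros and repeatedly append the largest symbol that
    creates a length-n window not seen before; stop when no symbol works.
    Returns the cyclic sequence of length k**n.
    """
    seq = [0] * n
    seen = {tuple(seq)}
    while True:
        suffix = seq[len(seq) - (n - 1):]  # last n-1 symbols (empty if n == 1)
        for symbol in range(k - 1, -1, -1):
            window = tuple(suffix + [symbol])
            if window not in seen:
                seen.add(window)
                seq.append(symbol)
                break
        else:
            break  # every extension repeats a window, so the sequence is done
    # The linear sequence ends with n-1 wrap-around zeros; drop them.
    return seq[:k ** n]


if __name__ == "__main__":
    print(prefer_max_sequence(2, 3))
```

A shift rule, by contrast, computes the symbol following any given window directly from that window, without storing the windows already visited.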
Duplication mutations play a critical role in the generation of biological sequences. Simultaneously, they have a deleterious effect on data stored using in-vivo DNA data storage. While duplications have been studied both as a sequence-generation mechanism and in the context of error correction, for simplicity these studies have not taken into acco...
The problem of one-way file synchronization, henceforth called “file updates”, is studied in this work. Specifically, a client edits a file, where the edits are modeled by insertions and deletions (InDels). An old copy of the file is stored remotely at a data-centre, and is also available to the client. We consider the problem of throughput- and co...
We study covering codes of permutations with the ℓ∞-metric. We provide a general code construction, which combines short building-block codes into a single long code. We focus on cyclic transitive groups as building blocks, determining their exact covering radius, and showing a linear-time algorithm for finding a covering codeword. When used in the...
Ever-increasing amounts of data are created and processed in internet-scale companies such as Google, Facebook, and Amazon. The efficient storage of such copious amounts of data has thus become a fundamental and acute problem in modern computing. No single machine can possibly satisfy such immense storage demands. Therefore, distributed storage sys...
We prove that there exist non-linear binary cyclic codes that attain the Gilbert-Varshamov bound.
Duplication mutations play a critical role in the generation of biological sequences. Simultaneously, they have a deleterious effect on data stored using in-vivo DNA data storage. While duplications have been studied both as a sequence-generation mechanism and in the context of error correction, for simplicity these studies have not taken into acco...
We study the size (or volume) of balls in the metric space of permutations, $S_n$, under the infinity metric. We focus on the regime of balls with radius $r = \rho \cdot (n-1)$, $\rho \in [0, 1]$, i.e., a radius that is a constant fraction of the maximum possible distance. We provide new lower bounds on the size of such balls. These new lower bounds reduce the asympt...
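For very small parameters the ball size can be computed by exhaustive enumeration, which gives a reference point for any bound. The sketch below is only a brute-force illustration, not a method from the paper; the function name `ball_size` is hypothetical. It counts permutations within infinity-distance $r$ of the identity, which suffices because the metric is right-invariant and so all balls of a given radius have the same size.

```python
from itertools import permutations


def ball_size(n, r):
    """Count permutations pi of {0, ..., n-1} with max_i |pi(i) - i| <= r,
    i.e. the size of the radius-r ball around the identity under the
    infinity metric. Brute force, so only practical for small n."""
    return sum(
        1
        for pi in permutations(range(n))
        if all(abs(pi[i] - i) <= r for i in range(n))
    )


if __name__ == "__main__":
    for r in range(4):
        print(r, ball_size(4, r))
```

At radius 0 the ball contains only the identity, and at the maximum radius $n-1$ it contains all $n!$ permutations; the interesting regime lies in between, where the count equals the permanent of a banded 0/1 matrix.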
We study random string-duplication systems, called Pólya string models, motivated by certain random mutation processes in the genome of living organisms. Unlike previous works that study the combinatorial capacity of string-duplication systems, or peripheral properties such as symbol frequency, this work provides exact capacity or bounds on it, for...
The ability to store data in the DNA of a living organism has applications in a variety of areas including synthetic biology and watermarking of patented genetically-modified organisms. Data stored in this medium is subject to errors arising from various mutations, such as point mutations, indels, and tandem duplication, which need to be corrected...