Chapter

The Sketching Complexity of Pattern Matching

DOI: 10.1007/978-3-540-27821-4_24
Source: DBLP

ABSTRACT We address the problems of pattern matching and approximate pattern matching in the sketching model. We show that it is impossible
to compress the text into a small sketch and use only the sketch to decide whether a given pattern occurs in the text. We
also prove a sketch size lower bound for approximate pattern matching, and show it is tight up to a logarithmic factor.

  • Source
    ABSTRACT: Motivated by a problem of transmitting data over broadcast channels (Birk and Kol, INFOCOM 1998), we study the following coding problem: a sender communicates with n receivers R<sub>1</sub>, ..., R<sub>n</sub>. He holds an input x ∈ {0, 1}<sup>n</sup> and wishes to broadcast a single message so that each receiver R<sub>i</sub> can recover the bit x<sub>i</sub>. Each R<sub>i</sub> has prior side information about x, induced by a directed graph G on n nodes; R<sub>i</sub> knows the bits of x in the positions {j | (i, j) is an edge of G}. We call encoding schemes that achieve this goal INDEX codes for {0, 1}<sup>n</sup> with side information graph G. In this paper we identify a measure on graphs, the minrank, which we conjecture to exactly characterize the minimum length of INDEX codes. We resolve the conjecture for certain natural classes of graphs. For arbitrary graphs, we show that the minrank bound is tight for both linear codes and certain classes of non-linear codes. For the general problem, we obtain a (weaker) lower bound that the length of an INDEX code for any graph G is at least the size of the maximum acyclic induced subgraph of G.
    Foundations of Computer Science, 2006. FOCS '06. 47th Annual IEEE Symposium on; 11/2006
  • Source
    ABSTRACT: We consider the problem of testing whether a function f: {0,1}<sup>n</sup> → {0,1} is computable by a read-once, width-2 ordered binary decision diagram (OBDD), also known as a branching program. This problem has two variants: one where the variables must occur in a fixed, known order, and one where the variables are allowed to occur in an arbitrary order. We show that for both variants, any nonadaptive testing algorithm must make Ω(n) queries, and thus any adaptive testing algorithm must make Ω(log n) queries. We also consider the more general problem of testing computability by width-w OBDDs where the variables occur in a fixed order. We show that for any constant w ≥ 4, Ω(n) queries are required, resolving a conjecture of Goldreich [15]. We prove all of our lower bounds using a new technique of Blais, Brody, and Matulef [6], giving simple reductions from known hard problems in communication complexity to the testing problems at hand. Our result for width-2 OBDDs provides the first example of the power of this technique for proving strong nonadaptive bounds.
    04/2011: pages 320-331;
  • Source
    ABSTRACT: Motivated by a problem of transmitting supplemental data over broadcast channels (Birk and Kol, INFOCOM 1998), we study the following coding problem: a sender communicates with n receivers R<sub>1</sub>, ..., R<sub>n</sub>. He holds an input x ∈ {0,1}<sup>n</sup> and wishes to broadcast a single message so that each receiver R<sub>i</sub> can recover the bit x<sub>i</sub>. Each R<sub>i</sub> has prior side information about x, induced by a directed graph G on n nodes; R<sub>i</sub> knows the bits of x in the positions {j | (i,j) is an edge of G}. G is known to the sender and to the receivers. We call encoding schemes that achieve this goal INDEX codes for {0,1}<sup>n</sup> with side information graph G. In this paper we identify a measure on graphs, the minrank, which exactly characterizes the minimum length of linear and certain types of nonlinear INDEX codes. We show that for natural classes of side information graphs, including directed acyclic graphs, perfect graphs, odd holes, and odd anti-holes, minrank is the optimal length of arbitrary INDEX codes. For arbitrary INDEX codes and arbitrary graphs, we obtain a lower bound in terms of the size of the maximum acyclic induced subgraph. This bound holds even for randomized codes, but has been shown not to be tight.
    IEEE Transactions on Information Theory 04/2011
