ABSTRACT: In the context of language recognition, we demonstrate the superiority of
streaming property testers over streaming algorithms and property testers
used separately. Introduced by Feigenbaum et al., a streaming property
tester is a streaming algorithm recognizing a language under the property
testing approximation: it must distinguish inputs of the language from those
that are $\varepsilon$-far from it, while using the smallest possible memory
(rather than limiting its number of input queries).
Our main result is a streaming $\varepsilon$-property tester for visibly
pushdown languages (VPL) with one-sided error using memory space
$\mathrm{poly}((\log n) / \varepsilon)$.
This construction relies on a new (non-streaming) property tester for
weighted regular languages based on a previous tester by Alon et al. We provide
a simple application of this tester for streaming testing special cases of
instances of VPL that are already hard for both streaming algorithms and
property testers.
Our main algorithm first performs an original simulation of visibly
pushdown automata using a stack of small height whose items may have linear
size. In a second step, those items are replaced by small sketches. Those
sketches rely on a notion of suffix-sampling that we introduce. This sampling is
the key idea connecting our streaming tester algorithm to property testers.
ABSTRACT: We solve an open problem by constructing quantum walks that not only detect but also find marked vertices in a graph. In the case when the marked set \(M\) consists of a single vertex, the number of steps of the quantum walk is quadratically smaller than the classical hitting time \(\mathrm{HT}(P,M)\) of any reversible random walk \(P\) on the graph. In the case of multiple marked elements, the number of steps is given in terms of a related quantity \(\mathrm{HT}^{+}(P,M)\) which we call extended hitting time. Our approach is new, simpler and more general than previous ones. We introduce a notion of interpolation between the random walk \(P\) and the absorbing walk \(P'\), whose marked states are absorbing. Then our quantum walk is simply the quantum analogue of this interpolation. Contrary to previous approaches, our results remain valid when the random walk \(P\) is not state-transitive. We also provide algorithms in the cases when only approximations or bounds on parameters \(p_M\) (the probability of picking a marked vertex from the stationary distribution) and \(\mathrm{HT}^{+}(P,M)\) are known.
ABSTRACT: We consider the randomized decision tree complexity of the recursive
3-majority function. We prove a lower bound of $(1/2-\delta) \cdot 2.57143^h$
for the two-sided-error randomized decision tree complexity of evaluating
height $h$ formulae with error $\delta \in [0,1/2)$. This improves the lower
bound of $(1-2\delta)(7/3)^h$ given by Jayram, Kumar, and Sivakumar (STOC'03),
and the one of $(1-2\delta) \cdot 2.55^h$ given by Leonardos (ICALP'13).
Second, we improve the upper bound by giving a new zero-error randomized
decision tree algorithm that has complexity at most $(1.007) \cdot 2.64944^h$.
The previous best known algorithm achieved complexity $(1.004) \cdot
2.65622^h$. The new lower bound follows from a better analysis of the base case
of the recursion of Jayram et al. The new algorithm uses a novel "interleaving"
of two recursive algorithms.
Random Structures and Algorithms 09/2013; DOI:10.1007/978-3-642-22006-7_27 · 0.92 Impact Factor
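For orientation, the classic baseline for recursive 3-majority can be sketched as follows. This is not the interleaved algorithm of the abstract above: it is the textbook zero-error strategy that evaluates two randomly chosen children and looks at the third only when they disagree, with expected cost commonly cited as $(8/3)^h$. The names `eval_maj3`, `leaf` and `counter` are illustrative assumptions.

```python
import random

def eval_maj3(height, leaf, counter, path=()):
    """Evaluate a recursive 3-majority formula of the given height.

    leaf(path) returns the 0/1 value of the leaf reached by `path`
    (a tuple of child indices); counter[0] accumulates leaf queries.
    Hypothetical helper for illustration only.
    """
    if height == 0:
        counter[0] += 1
        return leaf(path)
    order = [0, 1, 2]
    random.shuffle(order)  # pick the evaluation order of the children at random
    a = eval_maj3(height - 1, leaf, counter, path + (order[0],))
    b = eval_maj3(height - 1, leaf, counter, path + (order[1],))
    if a == b:
        return a  # two agreeing children already determine the majority
    # The first two disagree, so the third child decides.
    return eval_maj3(height - 1, leaf, counter, path + (order[2],))
```

On an input whose leaves all agree, this sketch queries exactly $2^h$ leaves, since the third child is never needed.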
ABSTRACT: We study the complexity of quantum query algorithms that make p queries in
parallel in each timestep. This model is motivated by the fact that decoherence
times of qubits are typically small, so it makes sense to parallelize quantum
algorithms as much as possible. We show tight bounds for a number of problems,
specifically Theta((n/p)^{2/3}) p-parallel queries for element distinctness and
Theta((n/p)^{k/(k+1)}) for k-SUM. Our upper bounds are obtained by parallelized
quantum walk algorithms, and our lower bounds are based on a relatively small
modification of the adversary lower bound method, combined with recent results
of Belovs et al. on learning graphs. We also prove some general bounds, in
particular that quantum and classical p-parallel complexity are polynomially
related for all total functions when p is not too large.
ABSTRACT: This work revisits the study of streaming algorithms where both input and
output are data streams. While streaming algorithms with multiple streams have
been studied before, such as in the context of sorting, most assumed very
nonrestrictive models and thus often had weak lower bounds. For this reason, we
consider data streams with restricted access, such as read-only and write-only
streams, as opposed to read-write streams. We also require streams to be
processed in one direction only, and forbid the use of any other external
streams. For read-write streams, we introduce a new complexity measure, the
expansion, that is the ratio between the maximal size of the stream during the
computation and the input size. We first study the problem of reversing a
stream of length n in our models, and give several tight bounds. In the
read-only and write-only model, we show that p-pass algorithms need memory
space {\Theta}(n/p). But if one of the streams is read-write, then the
complexity falls to {\Theta}(n/p^2) (with some additional restrictions for
the lower bound), and to polylog(n) when p = O(log n) if both streams are
read-write. We then study the problem of sorting and give several algorithms
with small expansion. Our main sorting algorithm is randomized and has constant
expansion, whereas previously known algorithms (without additional external
streams) had linear expansion.
ABSTRACT: We present two quantum walk algorithms for 3-Distinctness. Both algorithms have time complexity $\tilde{O}(n^{5/7})$, improving the previous $\tilde{O}(n^{3/4})$ and matching the best known upper bound for query complexity (obtained via learning graphs) up to log factors. The first algorithm is based on a connection between quantum walks and electric networks. The second algorithm uses an extension of the quantum walk search framework that facilitates quantum walks with nested updates.
Proceedings of the 40th international conference on Automata, Languages, and Programming - Volume Part I; 07/2013
ABSTRACT: Model checking and testing are two areas with a similar goal: to verify that
a system satisfies a property. They start with different hypotheses on the
systems and develop many techniques with different notions of approximation,
when an exact verification may be computationally too hard. We present some
notions of approximation with their logic and statistics backgrounds, which
yield several techniques for model checking and testing: Bounded Model
Checking, Approximate Model Checking, Approximate Black-Box Checking,
Approximate Model-based Testing and Approximate Probabilistic Model Checking.
All these methods guarantee some quality and efficiency of the verification.
ABSTRACT: We present an extension to the quantum walk search framework that facilitates
quantum walks with nested updates. We apply it to give a quantum walk algorithm
for 3-Distinctness with query complexity ~O(n^{5/7}), matching the best known
upper bound (obtained via learning graphs) up to log factors. Furthermore, our
algorithm has time complexity ~O(n^{5/7}), improving the previous ~O(n^{3/4}).
ABSTRACT: We develop a new framework that extends the quantum walk framework of
Magniez, Nayak, Roland, and Santha, by utilizing the idea of quantum data
structures to construct an efficient method of nesting quantum walks.
Surprisingly, only classical data structures were considered before for
searching via quantum walks.
The recently proposed learning graph framework of Belovs has yielded improved
upper bounds for several problems, including triangle finding and more general
subgraph detection. We exhibit the power of our framework by giving simple
explicit constructions that reproduce both the $O(n^{35/27})$ and $O(n^{9/7})$
learning graph upper bounds (up to logarithmic factors) for triangle finding,
and discuss how other known upper bounds in the original learning graph
framework can be converted to algorithms in our framework. We hope that the
ease of use of this framework will lead to the discovery of new upper bounds.
ABSTRACT: We show that the quantum query complexity of detecting if an $n$-vertex graph
contains a triangle is $O(n^{9/7})$. This improves the previous best algorithm
of Belovs making $O(n^{35/27})$ queries. For the problem of determining if an
operation $\circ : S \times S \rightarrow S$ is associative, we give an
algorithm making $O(|S|^{10/7})$ queries, the first improvement to the trivial
$O(|S|^{3/2})$ application of Grover search.
Our algorithms are designed using the learning graph framework of Belovs. We
give a family of algorithms for detecting constant-sized subgraphs, which can
possibly be directed and colored. These algorithms are designed in a simple
high-level language; our main theorem shows how this high-level language can be
compiled as a learning graph and gives the resulting complexity.
The key idea behind our improvements is to allow more freedom in the parameters
of the database kept by the algorithm. As in our previous work, the edge slots
maintained in the database are specified by a graph whose edges are the union
of regular bipartite graphs, the overall structure of which mimics that of the
graph of the certificate. By allowing these bipartite graphs to be unbalanced
and of variable degree we obtain better algorithms.
ABSTRACT: This work is in the line of designing efficient checkers for testing the
reliability of some massive data structures. Given sequential access to the
insert/extract operations on such a structure, one would like to decide, a
posteriori only, if it corresponds to the evolution of a reliable structure. In
a context of massive data, one would like to minimize both the amount of
reliable memory of the checker and the number of passes on the sequence of
operations. Chu, Kannan and McGregor initiated the study of checking priority
queues in this setting. They showed that the use of timestamps makes it possible to check a
priority queue with a single pass and memory space O(N^(1/2)), up to a
polylogarithmic factor. Later, Chakrabarti, Cormode, Kondapally and McGregor
removed the use of timestamps, and proved that more passes do not help. We show
that, even in the presence of timestamps, more passes do not help, solving a
previously open problem. On the other hand, we show that a second pass, but in
the reverse direction, shrinks the memory space to O((log N)^2), extending a
phenomenon first observed by Magniez, Mathieu and Nayak for checking
well-parenthesized expressions.
ABSTRACT: The quantum query complexity of Boolean matrix multiplication is typically
studied as a function of the matrix dimension, n, as well as the number of 1s
in the output, \ell. We prove an upper bound of O (n\sqrt{\ell}) for all values
of \ell. This is an improvement over previous algorithms for all values of
\ell. On the other hand, we show that for any \eps < 1 and any \ell <= \eps
n^2, there is an \Omega(n\sqrt{\ell}) lower bound for this problem, showing
that our algorithm is essentially tight.
We first reduce Boolean matrix multiplication to several instances of graph
collision. We then provide an algorithm that takes advantage of the fact that
the underlying graph in all of our instances is very dense to find all graph
collisions efficiently.
ABSTRACT: We present three semi-streaming algorithms for Maximum Bipartite Matching
with one and two passes. Our one-pass semi-streaming algorithm is deterministic
and returns a matching of size at least $1/2+0.005$ times the optimal matching
size in expectation, assuming that edges arrive one by one in (uniform) random
order. Our first two-pass algorithm is randomized and returns a matching of
size at least $1/2+0.019$ times the optimal matching size in expectation (over
its internal random coin flips) for any arrival order. These two algorithms
apply the simple Greedy matching algorithm several times on carefully chosen
subgraphs as a subroutine. Furthermore, we present a two-pass deterministic
algorithm for any arrival order returning a matching of size at least
$1/2+0.019$ times the optimal matching size. This algorithm is built on ideas
from the computation of semi-matchings.
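The Greedy subroutine mentioned above can be sketched in a few lines. This is only the plain one-pass Greedy baseline, which returns a maximal matching and therefore at least $1/2$ of the optimal size; it is not one of the three algorithms of the abstract. The function name and the convention that left and right vertices carry distinct labels are assumptions made for the illustration.

```python
def greedy_matching(edge_stream):
    """One-pass Greedy matching over a stream of (u, v) edges.

    Keeps an edge whenever both endpoints are still unmatched.
    Assumes left and right vertices use distinct labels (illustrative
    convention, not taken from the abstract).
    """
    matched = set()   # vertices already covered by the matching
    matching = []     # edges kept so far
    for u, v in edge_stream:
        if u not in matched and v not in matched:
            matching.append((u, v))
            matched.update((u, v))
    return matching
```

The result depends on the arrival order: on the path a2-b1-a1-b2, seeing the middle edge (a1, b1) first yields a matching of size 1, while the optimum has size 2.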
ABSTRACT: Let $H$ be a fixed $k$-vertex graph with $m$ edges and minimum degree $d >0$.
We use the learning graph framework of Belovs to show that the bounded-error
quantum query complexity of determining if an $n$-vertex graph contains $H$ as
a subgraph is $O(n^{2-2/k-t})$, where $t = \max\left\{\frac{k^2-2(m+1)}{k(k+1)(m+1)}, \frac{2k - d - 3}{k(d+1)(m-d+2)}\right\}$. The previous best
algorithm of Magniez et al. had complexity $\widetilde O(n^{2-2/k})$.
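As a sanity check of the formula above, one can instantiate it for the triangle, that is $k=3$, $m=3$, $d=2$ (a worked example added here for illustration, not part of the abstract):

```latex
\[
t=\max\left\{\frac{3^2-2(3+1)}{3\cdot 4\cdot 4},\ \frac{2\cdot 3-2-3}{3\cdot 3\cdot 3}\right\}
 =\max\left\{\frac{1}{48},\ \frac{1}{27}\right\}=\frac{1}{27},
\qquad
2-\frac{2}{k}-t=2-\frac{2}{3}-\frac{1}{27}=\frac{35}{27}.
\]
```

The resulting exponent $35/27$ agrees with the triangle-finding bound attributed to Belovs in the other abstracts of this list.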
ABSTRACT: We extend the study of the complexity of finding an $\eps$-approximate Nash
equilibrium in congestion games from the case of positive delay functions to
delays of arbitrary sign. We first prove that in symmetric games with
$\alpha$-bounded jump the $\eps$-Nash dynamic converges in polynomial time when
all delay functions are negative, similarly to the case of positive delays. We
then establish a hardness result for symmetric games with $\alpha$-bounded jump
and with arbitrary delay functions: in that case finding an $\eps$-Nash
equilibrium becomes $\PLS$-complete.
ABSTRACT: We study the problem of validating XML documents of size $N$ against general
DTDs in the context of streaming algorithms. The starting point of this work is
a well-known space lower bound. There are XML documents and DTDs for which
$p$-pass streaming algorithms require $\Omega(N/p)$ space.
We show that when allowing access to external memory, there is a
deterministic streaming algorithm that solves this problem with memory space
$O(\log^2 N)$, a constant number of auxiliary read/write streams, and $O(\log
N)$ total number of passes on the XML document and auxiliary streams.
An important intermediate step of this algorithm is the computation of the
First-Child-Next-Sibling (FCNS) encoding of the initial XML document in a
streaming fashion. We study this problem independently, and we also provide
memory efficient streaming algorithms for decoding an XML document given in its
FCNS encoding.
Furthermore, validating XML documents encoding binary trees in the usual
streaming model without external memory can be done with sublinear memory.
There is a one-pass algorithm using $O(\sqrt{N \log N})$ space, and a
bidirectional two-pass algorithm using $O(\log^2 N)$ space performing this
task.
ACM Transactions on Database Systems 12/2010; 38(4). DOI:10.1145/2274576.2274581 · 0.68 Impact Factor
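To make the First-Child-Next-Sibling (FCNS) encoding concrete, here is a minimal in-memory sketch. It is not the streaming computation studied in the abstract: it assumes the whole document tree fits in memory and is given as a list of `(label, children)` pairs, a representation chosen only for this illustration.

```python
def fcns(forest):
    """Encode an ordered forest as a binary tree (label, left, right).

    left  = encoding of the node's children (its first child and onwards),
    right = encoding of the node's following siblings.
    `forest` is a list of (label, children) pairs; hypothetical format.
    """
    if not forest:
        return None
    (label, children), siblings = forest[0], forest[1:]
    return (label, fcns(children), fcns(siblings))

def fcns_decode(node):
    """Invert fcns: rebuild the ordered forest from its binary encoding."""
    if node is None:
        return []
    label, first_child, next_sibling = node
    return [(label, fcns_decode(first_child))] + fcns_decode(next_sibling)
```

For a root `a` with children `b` and `c`, where `c` has a single child `d`, the encoding makes `b` the left child of `a`, `c` the right child of `b`, and `d` the left child of `c`.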
ABSTRACT: We solve an open problem by constructing quantum walks that not only detect but also find marked vertices in a graph. The number of steps of the quantum walk is quadratically smaller than the classical hitting time of any reversible random walk $P$ on the graph. Our approach is new, simpler and more general than previous ones. We introduce a notion of interpolation between the walk $P$ and the absorbing walk $P'$, whose marked states are absorbing. Then our quantum walk is simply the quantum analogue of the interpolation. Contrary to previous approaches, our results remain valid when the random walk $P$ is not state-transitive, and in the presence of multiple marked vertices. As a consequence we make progress on an open problem related to the spatial search on the 2D-grid. Comment: 15 pages
ABSTRACT: Motivated by a concrete problem and with the goal of understanding the relationship between the complexity of streaming algorithms and the computational complexity of formal languages, we investigate the problem Dyck(s) of checking matching parentheses, with s different types of parentheses. We present a one-pass randomized streaming algorithm for Dyck(2) with space O(√n log(n)) bits, time per letter polylog(n), and one-sided error. We prove that this one-pass algorithm is optimal, up to a log(n) factor, even when two-sided error is allowed, and conjecture that a similar bound holds for any constant number of passes over the input. Surprisingly, the space requirement shrinks drastically if we have access to the input stream "in reverse". We present a two-pass randomized streaming algorithm for Dyck(2) with space O((log n)^2), time polylog(n) and one-sided error, where the second pass is in the reverse direction. Both algorithms can be extended to Dyck(s) since this problem is reducible to Dyck(2) for a suitable notion of reduction in the streaming model. Except for an extra O(√log(s)) multiplicative overhead in the space required in the one-pass algorithm, the resource requirements are of the same order. For the lower bound, we exhibit hard instances Ascension(m) of Dyck(2) with length Θ(mn). We embed these in what we call a "one-pass" communication problem with 2m players, where m = Õ(n). To establish the hardness of Ascension(m), we prove a direct sum result by following the "information cost" approach, but with a few twists. Indeed, we play a subtle game between public and private coins for Mountain, which corresponds to a primitive instance Ascension(1). This mixture between public and private coins for m results from a balancing act between the direct sum result and a combinatorial lower bound for m.
Proceedings of the 42nd ACM Symposium on Theory of Computing, STOC 2010, Cambridge, Massachusetts, USA, 5-8 June 2010; 01/2010
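For reference, the problem Dyck(2) itself can be stated as a short exact checker. The sketch below is the standard stack-based test, which may use linear space on deeply nested inputs; it is a baseline for comparison and not the sublinear-space randomized algorithm of the abstract. The function name and the default bracket alphabet are assumptions.

```python
def is_dyck(word, pairs=(("(", ")"), ("[", "]"))):
    """Exact membership test for Dyck(s) with s = len(pairs) bracket types.

    Uses an explicit stack, so memory can grow linearly with the input.
    The alphabet in `pairs` is an illustrative default.
    """
    closer_of = dict(pairs)            # opening bracket -> expected closer
    closers = set(closer_of.values())
    stack = []                         # closers still owed, innermost last
    for letter in word:
        if letter in closer_of:
            stack.append(closer_of[letter])
        elif letter in closers:
            if not stack or stack.pop() != letter:
                return False           # unmatched or mismatched closer
        else:
            return False               # letter outside the alphabet
    return not stack                   # every opener must be closed
```

For instance, `is_dyck("([])[]")` holds while `is_dyck("([)]")` does not, because the two bracket types cross.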
ABSTRACT: We study the complexity of validating XML documents against any given DTD in
the context of streaming algorithms with external memory. We design a
deterministic algorithm that solves this problem with memory space $O(\log^2
N)$, a constant number of auxiliary read/write streams, and $O(\log N)$ total
number of passes on the XML document of size $N$ and auxiliary streams.
An important intermediate step is the memory-efficient computation of the
FCNS encoding of the initial XML document. Then, validity can already be
decided in one pass with memory space $O(\sqrt{N\log N})$, and no auxiliary
streams. A second but reverse pass makes the memory space collapse to $O(\log^2
N)$.
This suggests a systematic use of the FCNS encoding for large XML documents,
since, without this encoding, there are DTDs against which validating XML
documents requires memory space $\Omega(N/p)$ for any $p$-pass streaming
algorithm without auxiliary streams, even if randomization is allowed.
Last, for the special case of validating XML documents encoding binary trees,
we give a deterministic one-pass algorithm with memory space $O(\sqrt{N})$, and
prove its optimality, up to a multiplicative constant, even if randomization is
allowed.