## About

44 publications

1,694 reads

694 citations

## Publications


Generalizing work of Künnemann, Paturi, and Schneider [ICALP 2017], we study a wide class of high-dimensional dynamic programming (DP) problems in which one must find the shortest path between two points in a high-dimensional grid given a tensor of transition costs between nodes in the grid. This captures many classical problems which are solved...

In modern machine learning, inner product attention computation is a fundamental task for training large language models such as Transformer, GPT-1, BERT, GPT-2, GPT-3 and ChatGPT. Formally, in this problem, one is given as input three matrices $Q, K, V \in [-B,B]^{n \times d}$, and the goal is to construct the matrix $\mathrm{Att}(Q,K,V) := \mathr...

Over the last decade, deep neural networks have transformed our society, and they are already widely applied in various machine learning applications. State-of-the-art deep neural networks are becoming larger in size every year to deliver increasing model accuracy, and as a result, model training consumes substantial computing resources and will only c...

We give algorithms with lower arithmetic operation counts for both the Walsh-Hadamard Transform (WHT) and the Discrete Fourier Transform (DFT) on inputs of power-of-2 size $N$. For the WHT, our new algorithm has an operation count of $\frac{23}{24}N \log N + O(N)$. To our knowledge, this gives the first improvement on the $N \log N$ operation count...
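For context, the $N \log N$ operation count mentioned above is the one achieved by the textbook fast Walsh-Hadamard transform. The sketch below is that standard butterfly algorithm with an explicit operation counter; it is not the improved algorithm from the abstract, and the name `fwht` and the counting convention (one operation per addition or subtraction) are assumptions made for illustration.

```python
def fwht(values):
    """Unnormalized Walsh-Hadamard transform of a power-of-2-length list.

    Returns (transform, ops), where ops counts additions and subtractions.
    This is the textbook butterfly algorithm, shown only as a baseline.
    """
    n = len(values)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of 2")
    out = list(values)
    ops = 0
    h = 1
    while h < n:
        # Combine blocks of size h into blocks of size 2h.
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                a, b = out[i], out[i + h]
                out[i], out[i + h] = a + b, a - b
                ops += 2  # one addition and one subtraction
        h *= 2
    return out, ops
```

Under this counting convention, each of the $\log_2 N$ stages performs $N$ additions or subtractions, so `fwht` on an input of length 8 reports 24 operations.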

We give new, smaller constructions of constant-depth linear circuits for computing any matrix which is the Kronecker power of a fixed matrix. A standard argument (e.g., the mixed product property of Kronecker products, or a generalization of the Fast Walsh-Hadamard transform) shows that any such $N \times N$ matrix has a depth-2 circuit of size $O(...

We use lookup tables to design faster algorithms for important algebraic problems over finite fields. These faster algorithms, which only use arithmetic operations and lookup table operations, may help to explain the difficulty of determining the complexities of these important problems. Our results over a constant-sized finite field are as follows...

For any real numbers $B \ge 1$ and $\delta \in (0, 1)$ and function $f: [0, B] \rightarrow \mathbb{R}$, let $d_{B; \delta} (f) \in \mathbb{Z}_{> 0}$ denote the minimum degree of a polynomial $p(x)$ satisfying $\sup_{x \in [0, B]} \big| p(x) - f(x) \big| < \delta$. In this paper, we provide precise asymptotics for $d_{B; \delta} (e^{-x})$ and $d_{B;...

We design the first efficient sensitivity oracles and dynamic algorithms for a variety of parameterized problems. Our main approach is to modify the algebraic coding technique from static parameterized algorithm design, which had not previously been used in a dynamic context. We particularly build off of the 'extensor coding' method of Brand, Dell...

For a matrix $M$ and a positive integer $r$, the rank $r$ rigidity of $M$ is the smallest number of entries of $M$ which one must change to make its rank at most $r$. There are many known applications of rigidity lower bounds to a variety of areas in complexity theory, but fewer known applications of rigidity upper bounds. In this paper, we use rig...

For a function $\mathsf{K} : \mathbb{R}^{d} \times \mathbb{R}^{d} \to \mathbb{R}_{\geq 0}$, and a set $P = \{ x_1, \ldots, x_n\} \subset \mathbb{R}^d$ of $n$ points, the $\mathsf{K}$ graph $G_P$ of $P$ is the complete graph on $n$ nodes where the weight between nodes $i$ and $j$ is given by $\mathsf{K}(x_i, x_j)$. In this paper, we initiate the stu...
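As a small illustration of the definition above, the following sketch builds the weight matrix of a $\mathsf{K}$ graph explicitly. The Gaussian kernel, the function names, and the choice to store 0 on the diagonal (no self-loops) are assumptions for this example only; the abstract does not fix a particular kernel here.

```python
import math


def gaussian_kernel(x, y):
    """Hypothetical example kernel: exp(-||x - y||^2)."""
    return math.exp(-sum((a - b) ** 2 for a, b in zip(x, y)))


def kernel_graph(points, kernel):
    """Return the n x n weight matrix of the K graph on `points`.

    Entry [i][j] is kernel(points[i], points[j]) for i != j.
    The diagonal is set to 0.0, assuming the graph has no self-loops.
    """
    n = len(points)
    return [
        [kernel(points[i], points[j]) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
```

Writing out all $n^2$ weights like this takes quadratic time, which is the naive baseline for any computation on such a graph.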

The complexity of matrix multiplication is measured in terms of $\omega$, the smallest real number such that two $n\times n$ matrices can be multiplied using $O(n^{\omega+\epsilon})$ field operations for all $\epsilon>0$; the best bound until now is $\omega<2.37287$ [Le Gall'14]. All bounds on $\omega$ since 1986 have been obtained using the so-cal...

Fixed-parameter algorithms and kernelization are two powerful methods to solve NP-hard problems. Yet so far those algorithms have been largely restricted to static inputs. In this article, we provide fixed-parameter algorithms and kernelizations for fundamental NP-hard problems with dynamic inputs. We consider a variety of parameterized graph and h...

Josh Alman and Virginia Vassilevska Williams. A graph $G$ on $n$ nodes is an Orthogonal Vectors (OV) graph of dimension $d$ if there are vectors $v_1, \ldots, v_n \in \{0, 1\}^d$ such that nodes $i$ and $j$ are adjacent in $G$ if and only if $\langle v_i, v_j \rangle = 0$ over $\mathbb{Z}$. In this paper, we study a number of basic graph algorithm problems, except where one is given as input the ve...
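The adjacency rule in this definition can be made concrete with a short sketch that expands a list of 0/1 vectors into the edge set of the corresponding OV graph. The function name `ov_graph_edges` is hypothetical, and the brute-force loop over all pairs is only a direct reading of the definition, not an algorithm from the paper.

```python
def ov_graph_edges(vectors):
    """Return the edge set {(i, j) : i < j} of the OV graph of `vectors`.

    Nodes i and j are adjacent exactly when the integer inner product
    of vectors[i] and vectors[j] is zero.
    """
    n = len(vectors)
    edges = set()
    for i in range(n):
        for j in range(i + 1, n):
            inner = sum(a * b for a, b in zip(vectors[i], vectors[j]))
            if inner == 0:
                edges.add((i, j))
    return edges
```

For example, the vectors `(1, 0, 0)` and `(0, 1, 0)` are orthogonal and therefore adjacent, while `(1, 0, 0)` and `(1, 1, 0)` share a coordinate and are not.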

In predicate encryption for a function f, an authority can create ciphertexts and secret keys which are associated with ‘attributes’. A user with decryption key \(K_y\) corresponding to attribute y can decrypt a ciphertext \(CT_x\) corresponding to a message m and attribute x if and only if \(f(x,y)=0\). Furthermore, the attribute x remains hidden...

In this paper, we present a new algorithm for maintaining linear sketches in turnstile streams with faster update time. As an application, we show that $\log n$ `Count` sketches or `CountMin` sketches with a constant number of columns (i.e., buckets) can be implicitly maintained in *worst-case* $O(\log^{0.582} n)$ update time usi...

In a recent work, Alman and Vassilevska Williams [FOCS 2018, arXiv:1810.08671 [cs.CC]] proved limitations on designing matrix multiplication algorithms using the Galactic method applied to many tensors of interest, including the family of Coppersmith-Winograd tensors. In this note, we extend all their lower bounds to the more powerful Universal met...

We study the known techniques for designing Matrix Multiplication algorithms. The two main approaches are the Laser method of Strassen, and the Group theoretic approach of Cohn and Umans. We define a generalization based on zeroing outs which subsumes these two approaches, which we call the Solar method, and an even more general method based on mon...

The Light Bulb Problem is one of the most basic problems in data analysis. One is given as input $n$ vectors in $\{-1,1\}^d$, which are all independently and uniformly random, except for a planted pair of vectors with inner product at least $\rho \cdot d$ for some constant $\rho > 0$. The task is to find the planted pair. The most straightforward a...
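To make the problem statement concrete, here is a minimal sketch of a naive baseline that compares every pair of vectors and returns the pair with the largest inner product. This quadratic-time search is an assumption chosen for illustration and is not the algorithm developed in the paper; the name `find_most_correlated_pair` is hypothetical.

```python
from itertools import combinations


def find_most_correlated_pair(vectors):
    """Return indices (i, j), i < j, of the pair with the largest inner product.

    Examines all n*(n-1)/2 pairs, so it uses on the order of n^2 * d
    arithmetic operations for n vectors of dimension d.
    """
    def inner(pair):
        i, j = pair
        return sum(a * b for a, b in zip(vectors[i], vectors[j]))

    return max(combinations(range(len(vectors)), 2), key=inner)
```

When a planted pair has inner product at least $\rho \cdot d$ and all other pairs are far less correlated, the maximizing pair is the planted one.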

2018 IEEE. We study the known techniques for designing Matrix Multiplication algorithms. The two main approaches are the Laser method of Strassen, and the Group theoretic approach of Cohn and Umans. We define a generalization based on zeroing outs which subsumes these two approaches, which we call the Solar method, and an even more general method b...

In this work, we introduce an online model for communication complexity. Analogous to how online algorithms receive their input piece-by-piece, our model presents one of the players, Bob, his input piece-by-piece, and has the players Alice and Bob cooperate to compute a result each time before the next piece is revealed to Bob. This model has a clo...

We consider the techniques behind the current best algorithms for matrix multiplication. Our results are threefold. (1) We provide a unifying framework, showing that all known matrix multiplication running times since 1986 can be achieved from a single very natural tensor - the structural tensor $T_q$ of addition modulo an integer $q$. (2) We show...

Fixed-parameter algorithms and kernelization are two powerful methods to solve $\mathsf{NP}$-hard problems. Yet, so far those algorithms have been largely restricted to static inputs. In this paper we provide fixed-parameter algorithms and kernelizations for fundamental $\mathsf{NP}$-hard problems with dynamic inputs. We consider a variety of param...

We consider a notion of probabilistic rank and probabilistic sign-rank of a matrix, which measure the extent to which a matrix can be probabilistically represented by low-rank matrices. We demonstrate several connections with matrix rigidity, communication complexity, and circuit lower bounds. The most interesting outcomes are:
The Walsh-Hadamard T...

In this work, we introduce an online model for communication complexity. Analogous to how online algorithms receive their input piece-by-piece, our model presents one of the players, Bob, his input piece-by-piece, and has the players Alice and Bob cooperate to compute a result each time before it presents Bob with the next piece. This model has a closer and more natu...

We consider a notion of probabilistic rank and probabilistic sign-rank of a matrix, which measures the extent to which a matrix can be probabilistically represented by low-rank matrices. We demonstrate several connections with matrix rigidity, communication complexity, and circuit lower bounds, including: The Walsh-Hadamard Transform is Not Very Ri...

We design new polynomials for representing threshold functions in three different regimes: probabilistic polynomials of low degree, which need far less randomness than previous constructions, polynomial threshold functions (PTFs) with "nice" threshold behavior and degree almost as low as the probabilistic polynomials, and a new notion of probabilis...

In this paper, we undertake a systematic study of recurrences $x_{m+n} x_m = P(x_{m+1}, \ldots, x_{m+n-1})$ which exhibit the Laurent phenomenon. Some of the most famous among these sequences come from the Somos and the Gale-Robinson recurrences. Our approach is based on finding period 1 seeds of Laurent phenomenon algebras of Lam-Pylyavskyy. We comple...

We show how to compute any symmetric Boolean function on $n$ variables over any field (as well as the integers) with a probabilistic polynomial of degree $O(\sqrt{n \log(1/\epsilon)})$ and error at most $\epsilon$. The degree dependence on $n$ and $\epsilon$ is optimal, matching a lower bound of Razborov (1987) and Smolensky (1987) for the MAJORITY...

Following de Verdière–Gitler–Vertigan and Curtis–Ingerman–Morrow, we prove a host of new results on circular planar electrical networks. We first construct a poset of electrical networks with n boundary vertices, and prove that it is graded by number of edges of critical representatives. We then answer various enumerative questions related to , ada...

Curtis-Ingerman-Morrow characterize response matrices for circular planar electrical networks as symmetric square matrices with row sums zero and non-negative circular minors. In this paper, we study this positivity phenomenon more closely, from both algebraic and combinatorial perspectives. Extending work of Postnikov, we introduce electrical posi...

Following de Verdière-Gitler-Vertigan and Curtis-Ingerman-Morrow, we prove a host of new results on circular planar electrical networks. We introduce a poset $EP_n$ of electrical networks with $n$ boundary vertices, giving two equivalent characterizations, one combinatorial and the other topological. We then investigate various properties of the...