## About

270 Publications

25,314 Reads


13,954 Citations (since 2016)

## Publications

Publications (270)

We describe an algorithm for solving an important geometric problem arising in computer-aided manufacturing. When cutting away a region from a solid piece of material—such as steel, wood, ceramics, or plastic—using a rough tool in a milling machine, sharp convex corners of the region cannot be done properly, but have to be left for finer tools that...

Sketching is an important tool for dealing with high-dimensional vectors that are sparse (or well-approximated by a sparse vector), especially useful in distributed, parallel, and streaming settings. It is known that sketches can be made differentially private by adding noise according to the sensitivity of the sketch, and this has been used in pri...

Simple tabulation hashing dates back to Zobrist in 1970 and is defined as follows: Each key is viewed as $c$ characters from some alphabet $\Sigma$, we have $c$ fully random hash functions $h_0, \ldots, h_{c - 1} \colon \Sigma \to \{0, \ldots, 2^l - 1\}$, and a key $x = (x_0, \ldots, x_{c - 1})$ is hashed to $h(x) = h_0(x_0) \oplus \ldots \oplus h_...
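The definition above maps directly to code; a minimal Python sketch (the parameters c = 4 characters of 8 bits and l = 32 output bits are illustrative choices, not taken from the paper):

```python
import random

def make_simple_tabulation(c=4, char_bits=8, l=32, seed=0):
    """Simple tabulation (Zobrist 1970): view a key as c characters of
    char_bits bits each, look each character up in its own table of
    random l-bit values, and XOR the results together."""
    rng = random.Random(seed)
    tables = [[rng.randrange(1 << l) for _ in range(1 << char_bits)]
              for _ in range(c)]
    mask = (1 << char_bits) - 1

    def h(x):
        v = 0
        for i in range(c):
            v ^= tables[i][(x >> (i * char_bits)) & mask]
        return v

    return h
```

With these defaults, each hash computation touches four 256-entry tables, which fit in cache and make the scheme fast in practice.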

We present a dynamic algorithm for maintaining the connected and 2-edge-connected components in an undirected graph subject to edge deletions. The algorithm is Monte-Carlo randomized and processes any sequence of edge deletions in $O(m + n \operatorname{polylog} n)$ total time. Interspersed with the deletions, it can answer queries to whether any t...

We consider the numerical taxonomy problem of fitting a positive distance function ${D:{S\choose 2}\rightarrow \mathbb R_{>0}}$ by a tree metric. We want a tree $T$ with positive edge weights and including $S$ among the vertices so that their distances in $T$ match those in $D$. A nice application is in evolutionary biology where the tree $T$ aims...

In dynamic load balancing, we wish to distribute balls into bins in an environment where both balls and bins can be added and removed. We want to minimize the maximum load of any bin but we also want to minimize the number of balls and bins affected when adding or removing a ball or a bin. We want a hashing-style solution where, given the ID of a...

We say that a random integer variable $X$ is \emph{monotone} if the modulus of the characteristic function of $X$ is decreasing on $[0,\pi]$. This is the case for many commonly encountered variables, e.g., Bernoulli, Poisson and geometric random variables. In this note, we provide estimates for the probability that the sum of independent monotone i...

Locality-sensitive hashing (LSH), introduced by Indyk and Motwani in STOC ’98, has been an extremely influential framework for nearest neighbor search in high-dimensional data sets. While theoretical work has focused on the approximate nearest neighbor problem, in practice LSH data structures with suitably chosen parameters are used to solve the ex...

The classic way of computing a $k$-universal hash function is to use a random degree-$(k-1)$ polynomial over a prime field $\mathbb Z_p$. For a fast computation of the polynomial, the prime $p$ is often chosen as a Mersenne prime $p=2^b-1$. In this paper, we show that there are other nice advantages to using Mersenne primes. Our view is that the ou...
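As an illustrative sketch of the classic construction, here is a degree-(k-1) polynomial hash evaluated by Horner's rule over the Mersenne prime $p = 2^{61} - 1$, using the standard shift-and-add reduction that exploits $2^b \equiv 1 \pmod{p}$ (parameter choices are mine, not the paper's):

```python
import random

def make_k_universal(k=4, b=61, seed=0):
    """k-universal hashing via a random degree-(k-1) polynomial over
    Z_p with the Mersenne prime p = 2^b - 1 (here 2^61 - 1).  Since
    2^b = 1 (mod p), y mod p can be computed as (y & p) + (y >> b),
    avoiding an expensive division."""
    p = (1 << b) - 1
    rng = random.Random(seed)
    coeffs = [rng.randrange(p) for _ in range(k)]

    def h(x):  # assumes 0 <= x < p
        v = 0
        for a in coeffs:  # Horner's rule
            v = v * x + a
            v = (v & p) + (v >> b)  # first Mersenne reduction
            v = (v & p) + (v >> b)  # second pass; now v < 2p
            if v >= p:
                v -= p
        return v

    return h
```

The two reduction passes are needed because the first pass of a ~2b-bit product can still leave a value above $2^b$.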

We say that a simple, closed curve γ in the plane has bounded convex curvature if for every point x on γ, there is an open unit disk Ux and εx>0 such that x∈∂Ux and Bεx(x)∩Ux⊂Int γ. We prove that the interior of every curve of bounded convex curvature contains an open unit disk.

To get estimators that work within a certain error bound with high probability, a common strategy is to design one that works with constant probability, and then boost the probability using independent repetitions. Important examples of this approach are small space algorithms for estimating the number of distinct elements in a stream, or estimatin...
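The boosting step described here is commonly realized as the median trick; a minimal sketch (the estimator interface and the toy numbers in the test are assumptions for illustration):

```python
import statistics

def boosted_estimate(base_estimator, r=15):
    """Median trick: run r independent copies of an estimator that lands
    within the desired error bound with constant probability > 1/2 and
    return the median; by a Chernoff bound, the probability that the
    median misses the bound drops exponentially in r."""
    return statistics.median(base_estimator() for _ in range(r))
```

Taking the median rather than the mean is what makes the combination robust to the occasional wildly-off repetition.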

Recently, Kawarabayashi and Thorup presented the first deterministic edge-connectivity recognition algorithm in near-linear time. A crucial step in their algorithm uses the existence of vertex subsets of a simple graph G on n vertices whose contractions leave a multigraph with Õ(n∕δ) vertices and Õ(n) edges that preserves all non-trivial min-cuts o...

Each vertex of an arbitrary simple graph on $n$ vertices chooses $k$ random incident edges. What is the expected number of edges in the original graph that connect different connected components of the sampled subgraph? We prove that the answer is $O(n/k)$, when $k\ge c\log n$, for some large enough $c$. We conjecture that the same holds for smalle...

We say that a simple, closed curve $\gamma$ in the plane has bounded convex curvature if for every point $x$ on $\gamma$, there is an open unit disk $U_x$ and $\varepsilon_x>0$ such that $x\in\partial U_x$ and $B_{\varepsilon_x}(x)\cap U_x\subset\text{Int}\;\gamma$. We prove that the interior of every curve of bounded convex curvature contains an o...

We provide a simple new randomized contraction approach to the global minimum cut problem for simple undirected graphs. The contractions exploit 2-out edge sampling from each vertex rather than the standard uniform edge sampling. We demonstrate the power of our new approach by obtaining better algorithms for sequential, distributed, and parallel mo...

We develop a new algorithm for the turnstile heavy hitters problem in general turnstile streams, the EXPANDERSKETCH, which finds the approximate top-k items in a universe of size n using the same asymptotic O(k log n) words of memory and O(log n) update time as the COUNTMIN and COUNTSKETCH, but requiring only O(k poly(log n)) time to answer queries...

Consider collections $\mathcal{A}$ and $\mathcal{B}$ of red and blue sets, respectively. Bichromatic Closest Pair is the problem of finding a pair from $\mathcal{A}\times \mathcal{B}$ that has similarity higher than a given threshold according to some similarity measure. Our focus here is the classic Jaccard similarity $|\textbf{a}\cap \textbf{b}|/...

Previous work on tabulation hashing of P\v{a}tra\c{s}cu and Thorup from STOC'11 on simple tabulation and from SODA'13 on twisted tabulation offered Chernoff-style concentration bounds on hash based sums, but under some quite severe restrictions on the expected values of these sums. More precisely, the basic idea in tabulation hashing is to view a k...

We describe a way of assigning labels to the vertices of any undirected graph on up to n vertices, each composed of n/2 + O(1) bits, such that given the labels of two vertices, and no other information regarding the graph, it is possible to decide whether or not the vertices are adjacent in the graph. This is optimal, up to an additive constant, an...

We present a deterministic algorithm that computes the edge-connectivity of a graph in near-linear time. This is for a simple undirected unweighted graph G with n vertices and m edges. This is the first o(mn) time deterministic algorithm for the problem. Our algorithm is easily extended to find a concrete minimum edge-cut. In fact, we can construct...

Locality-sensitive hashing (LSH), introduced by Indyk and Motwani in STOC '98, has been an extremely influential framework for nearest neighbor search in high-dimensional data sets. While theoretical work has focused on the approximate nearest neighbor problems, in practice LSH data structures with suitably chosen parameters are used to solve the e...

We consider the hashing of a set $X\subseteq U$ with $|X|=m$ using a simple tabulation hash function $h:U\to [n]=\{0,\dots,n-1\}$ and analyse the number of non-empty bins, that is, the size of $h(X)$. We show that the expected size of $h(X)$ matches that with fully random hashing to within low-order terms. We also provide concentration bounds. The...

Recently, Kawarabayashi and Thorup presented the first deterministic edge-connectivity recognition algorithm in near-linear time. A crucial step in their algorithm uses the existence of vertex subsets of a simple graph $G$ on $n$ vertices whose contractions leave a multigraph with $\tilde{O}(n/\delta)$ vertices and $\tilde{O}(n)$ edges that preserv...

When deciding where to place access points in a wireless network, it is useful to model the signal propagation loss between a proposed antenna location and the areas it may cover. The indoor dominant path (IDP) model, introduced by Wölfle et al., is shown in the literature to have good validation and generalization error, is faster to compute than...

We consider very natural "fence enclosure" problems studied by Capoyleas, Rote, and Woeginger and Arkin, Khuller, and Mitchell in the early 90s. Given a set S of n points in the plane, we aim at finding a set of closed curves such that (1) each point is enclosed by a curve and (2) the total length of the curves is minimized. We consider two main va...

Suppose that we are to place $m$ balls into $n$ bins sequentially using the $d$-choice paradigm: For each ball we are given a choice of $d$ bins, according to $d$ hash functions $h_1,\dots,h_d$ and we place the ball in the least loaded of these bins breaking ties arbitrarily. Our interest is in the number of balls in the fullest bin after all $m$ b...
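A small simulation of this d-choice placement (a seeded PRNG stands in for the hash functions $h_1,\dots,h_d$; all parameters are illustrative):

```python
import random

def d_choice(m, n, d=2, seed=0):
    """Place m balls into n bins with the d-choice paradigm: each ball
    draws d candidate bins (here from a seeded PRNG rather than hash
    functions) and goes to the least loaded candidate, ties broken
    arbitrarily.  Returns the final load vector."""
    rng = random.Random(seed)
    load = [0] * n
    for _ in range(m):
        candidates = [rng.randrange(n) for _ in range(d)]
        best = min(candidates, key=lambda j: load[j])
        load[best] += 1
    return load
```

For m = n with truly random hashing, d = 2 famously brings the maximum load down from $\Theta(\log n/\log\log n)$ to $\Theta(\log\log n)$.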

We consider very natural "fence enclosure" problems studied by Capoyleas, Rote, and Woeginger and Arkin, Khuller, and Mitchell in the early 90s. Given a set $S$ of $n$ points in the plane, we aim at finding a set of closed curves such that (1) each point is enclosed by a curve and (2) the total length of the curves is minimized. We consider two mai...

We present a deterministic incremental algorithm for exactly maintaining the size of a minimum cut with O(log³n log log²n) amortized time per edge insertion and O(1) query time. This result partially answers an open question posed by Thorup (2007). It also stays in sharp contrast to a polynomial conditional lower bound for the fully dynamic weighte...

Hashing is a basic tool for dimensionality reduction employed in several aspects of machine learning. However, the performance analysis is often carried out under the abstract assumption that a truly random unit cost hash function is used, without concern for which concrete hash function is employed. The concrete hash function may work fine on suffi...

We present a deterministic fully-dynamic data structure for maintaining information about the bridges in a graph. We support updates in $\tilde{O}((\log n)^2)$ amortized time, and can find a bridge in the component of any given vertex, or a bridge separating any two given vertices, in $O(\log n / \log \log n)$ worst case time. Our bounds match the...

Randomized algorithms are often enjoyed for their simplicity, but the hash functions employed to yield the desired probabilistic guarantees are often too complicated to be practical. Here, we survey recent results on how simple hashing schemes based on tabulation provide unexpectedly strong guarantees.
Simple tabulation hashing dates back to Zobris...

We consider the Similarity Sketching problem: Given a universe $[u]= \{0,\ldots,u-1\}$ we want a random function $S$ mapping subsets $A\subseteq [u]$ into vectors $S(A)$ of size $t$, such that similarity is preserved. More precisely: Given sets $A,B\subseteq [u]$, define $X_i=[S(A)[i]= S(B)[i]]$ and $X=\sum_{i\in [t]}X_i$. We want to have $E[X]=t\c...
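One standard way to realize such a sketch, shown here only as an illustrative instance, is t independent MinHash coordinates, for which each coordinate matches with probability equal to the Jaccard similarity of A and B (up to negligible hash collisions); blake2b stands in for independent random hash functions:

```python
import hashlib

def minhash_sketch(A, t, seed=0):
    """t-coordinate MinHash sketch: coordinate i is the minimum of a
    (pseudo)random hash h_i over the set A."""
    def h(i, x):
        msg = f"{seed}:{i}:{x}".encode()
        return int.from_bytes(
            hashlib.blake2b(msg, digest_size=8).digest(), "big")
    return [min(h(i, x) for x in A) for i in range(t)]

def estimate_jaccard(sA, sB):
    """X/t, where X counts coordinates with S(A)[i] == S(B)[i]."""
    return sum(a == b for a, b in zip(sA, sB)) / len(sA)
```

This matches the abstract's setup: $X_i = [S(A)[i] = S(B)[i]]$ and the estimator is $X/t$.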

Backwards analysis, first popularized by Seidel, is often the simplest and most elegant way of analyzing a randomized algorithm. It applies to incremental algorithms where elements are added incrementally, following some random permutation, e.g., incremental Delaunay triangulation of a point set, where points are added one by one, and where we always ma...

We present a deterministic incremental algorithm for \textit{exactly} maintaining the size of a minimum cut with $\widetilde{O}(1)$ amortized time per edge insertion and $O(1)$ query time. This result partially answers an open question posed by Thorup [Combinatorica 2007]. It also stays in sharp contrast to a polynomial conditional lower-bound for...

In turnstile $\ell_p$ $\varepsilon$-heavy hitters, one maintains a high-dimensional $x\in\mathbb{R}^n$ subject to $\texttt{update}(i,\Delta)$ causing $x_i\leftarrow x_i + \Delta$, where $i\in[n]$, $\Delta\in\mathbb{R}$. Upon receiving a query, the goal is to report a small list $L\subset[n]$, $|L| = O(1/\varepsilon^p)$, containing every "heavy hitt...

We describe an algorithm for solving an important geometric problem arising in computer-aided manufacturing. When machining a pocket in a solid piece of material such as steel using a rough tool in a milling machine, sharp convex corners of the pocket cannot be done properly, but have to be left for finer tools that are more expensive to use. We wa...

These lecture notes show that linear probing takes expected constant time if the hash function is 5-independent. This result was first proved by Pagh et al. [STOC'07, SICOMP'09]. The simple proof here is essentially taken from [Patrascu and Thorup ICALP'10]. The lecture is a nice illustration of the use of higher moments in data structures, and coul...

We present a deterministic dynamic connectivity data structure for undirected graphs with worst-case update time $O(\sqrt{n}/w^{1/4})$ and constant query time, where $w = \Omega(\log n)$ is the word size. This bound improves on the previous best deterministic worst-case algorithm of Frederickson (STOC, 1983) and Eppstein, Galil, Italiano, and Nissen...

We consider the following fundamental problems: (1) Constructing $k$-independent hash functions with a space-time tradeoff close to Siegel's lower bound. (2) Constructing representations of unbalanced expander graphs having small size and allowing fast computation of the neighbor function. It is not hard to show that these problems are intimately c...

Randomized algorithms are often enjoyed for their simplicity, but the hash functions employed to yield the desired probabilistic guarantees are often too complicated to be practical. Here we survey recent results on how simple hashing schemes based on tabulation provide unexpectedly strong guarantees. *Simple tabulation hashing* dates back to...

These notes describe the most efficient hash functions currently known for hashing integers and strings. These modern hash functions are often an order of magnitude faster than those presented in standard text books. They are also simpler to implement, and hence a clear win in practice, but their analysis is harder. Some of the most practical hash...

In the CONGEST model, a communications network is an undirected graph whose n nodes are processors and whose m edges are the communications links between processors. At any given time step, a message of size O(log n) may be sent by each node to each of its neighbours. We show for the synchronous model: If all nodes start in the same round, and each...

In this paper we propose a hash function for $k$-partitioning a set into bins so that we get good concentration bounds when combining statistics from different bins. To understand this point, suppose we have a fully random hash function applied to a set $X$ of red and blue balls. We want to estimate the fraction $f$ of red balls. The idea of MinHas...

We show how to represent a planar digraph in linear space so that distance queries can be answered in constant time. The data structure can be constructed in linear time. This representation of reachability is thus optimal in both time and space, and has optimal construction time. The previous best solution used $O(n\log n)$ space for constant quer...

We present a deterministic near-linear time algorithm that computes the edge-connectivity and finds a minimum cut for a simple undirected unweighted graph G with n vertices and m edges. This is the first o(mn) time deterministic algorithm for the problem. In near-linear time we can also construct the classic cactus representation of all minimum cut...

A random sampling function Sample: U -> {0,1} for a key universe U is a distinguisher with probability p if for any given assignment of values v(x) to the keys x in U, including at least one non-zero v(x) != 0, the sampled sum sum{ v(x) | x in U and Sample(x) } is non-zero with probability at least p. Here the key values may come from any commutative mo...

We present a data structure representing a dynamic set S of w-bit integers on a w-bit word RAM. With |S| = n and w > log n and space O(n), we support the following standard operations in O(log n / log w) time:
- insert(x) sets S = S + {x}.
- delete(x) sets S = S - {x}.
- predecessor(x) returns max{y in S | y < x}.
- successor(x) returns min{y in S | y...

The power of two choices is a classic paradigm used for assigning $m$ balls to $n$ bins. When placing a ball we pick two bins according to some hash functions $h_0$ and $h_1$, and place the ball in the least full bin. It was shown by Azar et al. [STOC'94] that for $m = O(n)$ with perfectly random hash functions this scheme yields a maximum load of...

A random hash function $h$ is $\varepsilon$-minwise if for any set $S$, $|S|=n$, and element $x\in S$, $\Pr[h(x)=\min h(S)]=(1\pm\varepsilon)/n$. Minwise hash functions with low bias $\varepsilon$ have widespread applications within similarity estimation. Hashing from a universe $[u]$, the twisted tabulation hashing of P\v{a}tra\c{s}cu and Thorup [...

We describe a way of assigning labels to the vertices of any undirected graph on up to $n$ vertices, each composed of $n/2+O(1)$ bits, such that given the labels of two vertices, and no other information regarding the graph, it is possible to decide whether or not the vertices are adjacent in the graph. This is optimal, up to an additive constant,...

A random hash function h is ε-minwise if for any set S, |S| = n, and element x ∈ S, \(\Pr[h(x)=\min h(S)]=(1\pm\varepsilon )/n\). Minwise hash functions with low bias ε have widespread applications within similarity estimation.
Hashing from a universe [u], the twisted tabulation hashing of Pǎtraşcu and Thorup [SODA’13] makes c = O(1) lookups in tab...

Recognizing 3-colorable graphs is one of the most famous NP-complete problems [Garey, Johnson, and Stockmeyer STOC'74]. The problem of coloring 3-colorable graphs in polynomial time with as few colors as possible has been intensively studied: $O(n^{1/2})$ colors [Wigderson STOC'82], $\tilde{O}(n^{2/5})$ colors [Blum STOC'89], $\tilde{O}(n^{3/8})$ colors [Blum FOCS'90], $O(n^{1/4})$...

Simple tabulation dates back to Zobrist in 1970. Keys are viewed as c characters from some alphabet A. We initialize c tables h_0, ..., h_{c-1} mapping characters to random hash values. A key x = (x_0, ..., x_{c-1}) is hashed to h_0[x_0] xor ... xor h_{c-1}[x_{c-1}]. The scheme is extremely fast when the character hash tables h_i are in cache. Simple t...

Bottom-k sketches are an alternative to k×minwise sketches when using hashing to estimate the similarity of documents represented by shingles (or set similarity in general) in large-scale machine learning. They are faster to compute and have nicer theoretical properties. In the case of k×minwise hashing, the bias introduced by not truly random hash...

Throughout the last decade, extensive deployment of popular intra-domain routing protocols such as Open Shortest Path First (OSPF) and Intermediate System to Intermediate System (IS-IS) has drawn ever-increasing attention to Internet traffic engineering. This paper reviews optimization techniques that have been deployed for managing intra-domain routing in netwo...

We consider bottom-k sampling for a set X, picking a sample Sk(X) consisting of the k elements that are smallest according to a given hash function h. With this sample we can estimate the relative size f=|Y|/|X| of any subset Y as |Sk(X) intersect Y|/k. A standard application is the estimation of the Jaccard similarity f=|A intersect B|/|A union B|...
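A small prototype of this bottom-k estimator (blake2b is merely a stand-in for the hash function h; all parameters are illustrative):

```python
import hashlib

def _h(x):
    """Stand-in for the hash function h: maps an element to [0, 1)."""
    d = hashlib.blake2b(repr(x).encode(), digest_size=8).digest()
    return int.from_bytes(d, "big") / 2 ** 64

def bottom_k(X, k):
    """The bottom-k sample S_k(X): the k elements of X with the
    smallest hash values."""
    return set(sorted(X, key=_h)[:k])

def estimate_fraction(X, Y, k):
    """Estimate f = |Y|/|X| for a subset Y of X as |S_k(X) ∩ Y| / k."""
    return len(bottom_k(X, k) & set(Y)) / k
```

For the Jaccard similarity of A and B as in the abstract, one takes X = A ∪ B and Y = A ∩ B.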

We survey recent results on parallel repetition theorems for computationally-sound interactive proofs (a.k.a. interactive arguments).

Experts suggest that some pure result-based funding needs to be initiated to fund successful research projects. An x-year grant can be based on results from the last x years. This eliminates the issue of unpredictable research from a research foundation perspective. The researcher can, at his or her own risk, follow the craziest inspiration, but he or she ha...

We show that linear probing requires 5-independent hash functions for expected constant-time performance, matching an upper bound of [Pagh et al. STOC'07]. More precisely, we construct 4-independent hash functions yielding expected logarithmic search time. For $(1+\epsilon)$-approximate minwise independence, we show that $\Omega(\log 1/\epsilon)$-...

We introduce a new tabulation-based hashing scheme called "twisted tabulation". It is essentially as simple and fast as simple tabulation, but has some powerful distributional properties illustrating its promise: (1) If we sample keys with arbitrary probabilities, then with high probability, the number of samples inside any subset is concentrated e...

Distance oracles are data structures that provide fast (possibly approximate) answers to shortest-path and distance queries in graphs. The tradeoff between the space requirements and the query time of distance oracles is of particular interest and the main focus of this paper. Unless stated otherwise, we assume all graphs to be planar and undirecte...

Given a weighted undirected graph, our basic goal is to represent all pairwise distances using much less than quadratic space, such that we can estimate the distance between query vertices in constant time. We will study the inherent trade-off between space of the representation and the stretch (multiplicative approximation disallowing underestimat...

Randomized algorithms are often enjoyed for their simplicity, but the hash functions used to yield the desired theoretical guarantees are often neither simple nor practical. Here we show that the simplest possible tabulation hashing provides unexpectedly strong guarantees. The scheme itself dates back to Zobrist in 1970 who used it for game playing...

We consider the problem of coloring a 3-colorable graph in polynomial time using as few colors as possible. We present a combinatorial algorithm getting down to $\tilde{O}(n^{4/11})$ colors. This is the first combinatorial improvement of Blum's $\tilde{O}(n^{3/8})$ bound from FOCS'90. Like Blum's algorithm, our new algorithm composes nicely w...

In the framework of Wegman and Carter, a $k$-independent hash function maps any $k$ keys independently. It is known that 5-independent hashing provides good expected performance in applications such as linear probing and second moment estimation for data streams. The classic $5$-independent hash function evaluates a degree 4 polynomial over a prime...

We consider portable software implementations of hash tables with timeouts. The context is a high volume stream of keyed items. When a new item arrives, we want to know if it has been seen recently in terms of a fixed lifespan. This problem has numerous applications as a front-end for Internet traffic processing where the key could be a selection of f...

We present a new threshold phenomenon in data structure lower bounds where slightly reduced update times lead to exploding query times. Consider incremental connectivity, letting t_u be the time to insert an edge and t_q be the query time. For t_u = Omega(t_q), the problem is equivalent to the well-understood union-find problem: InsertEdge(s,t) can...

We consider the minimum k-way cut problem for unweighted graphs with a size bound s on the number of cut edges allowed. Thus we seek to remove as few edges as possible so as to split a graph into k components, or report that this requires cutting more than s edges. We show that this problem is fixed-parameter tractable (FPT) in s. More precisely,...

From a high volume stream of weighted items, we want to maintain a generic sample of a certain limited size $k$ that we can later use to estimate the total weight of arbitrary subsets. This is the classic context of on-line reservoir sampling, thinking of the generic sample as a reservoir. We present an efficient reservoir sampling scheme, $\textno...

Randomized algorithms are often enjoyed for their simplicity, but the hash functions used to yield the desired theoretical guarantees are often neither simple nor practical. Here we show that the simplest possible tabulation hashing provides unexpectedly strong guarantees.
The scheme itself dates back to Zobrist in 1970 who used it for game playing...

We show that linear probing requires 5-independent hash functions for expected constant-time performance, matching an upper bound of [A. Pagh et al., SIAM J. Comput. 39, No. 3, 1107–1120 (2009; Zbl 1192.68204)]. For $(1+\epsilon)$-approximate minwise independence, we show that $\Omega(\log 1/\epsilon)$-independent hash functions are required, matching an upper bound of [P....

We describe a simple, but powerful local encoding technique, implying two surprising results: 1. We show how to represent a vector of n values from some alphabet S using ceiling(n * log2 |S|) bits, such that reading or writing any entry takes O(1) time. This demonstrates, for instance, an "equivalence" between decimal and binary computers, and has...

Regular expression matching is a key task (and often computational bottleneck) in a variety of software tools and applications. For instance, the standard grep and sed utilities, scripting languages such as perl, internet traffic analysis, XML querying, and protein searching. The basic definition of a regular expression is that we combine characters...

Previously [SODA'04] we devised the fastest known algorithm for 4-universal hashing. The hashing was based on small pre-computed 4-universal tables. This led to a five-fold improvement in speed over direct methods based on degree 3 polynomials. In this paper, we show that if the pre-computed tables are made 5-universal, then the hash value becomes 5...

We present two new algorithms for finding optimal strategies for discounted, infinite-horizon, Deterministic Markov Decision Processes (DMDP). The first one is an adaptation of an algorithm of Young, Tarjan and Orlin for finding minimum mean weight cycles. It runs in $O(mn + n^2 \log n)$ time, where n is the number of vertices (or states) and m is th...

Many data sets occur as unaggregated data sets, where multiple data points are associated with each key. In the aggregate view of the data, the weight of a key is the sum of the weights of data points associated with the key. Examples are measurements of IP packet header streams, distributed data streams produced by events registered by sensor ne...

Regular expression matching is a key task (and often the computational bottleneck) in a variety of widely used software tools and applications, for instance, the unix grep and sed commands, scripting languages such as awk and perl, programs for analyzing massive data streams, etc. We show how to solve this ubiquitous task in linear space and O(nm(l...

Linear probing is one of the most popular implementations of dynamic hash tables storing all keys in a single array. When we get a key, we first hash it to a location. Next we probe consecutive locations until the key or an empty location is found. At STOC'07, Pagh et al. presented data sets where the standard implementation of 2-universal hashing...
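The probing procedure described above can be sketched as a minimal table; Python's built-in hash stands in for the hash functions under discussion, deletions are not supported, and capacity must exceed the number of stored keys:

```python
class LinearProbingTable:
    """Minimal open-addressing hash table with linear probing."""

    def __init__(self, capacity=64):
        self.cap = capacity
        self.slots = [None] * capacity  # each slot: None or (key, value)

    def _probe(self, key):
        i = hash(key) % self.cap  # hash the key to a location
        # probe consecutive locations until the key or an empty slot
        while self.slots[i] is not None and self.slots[i][0] != key:
            i = (i + 1) % self.cap
        return i

    def put(self, key, value):
        self.slots[self._probe(key)] = (key, value)

    def get(self, key):
        entry = self.slots[self._probe(key)]
        return entry[1] if entry is not None else None
```

Keeping all keys in a single array is exactly what gives linear probing its cache-friendliness, at the cost of sensitivity to the hash function's quality.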

From a high volume stream of weighted items, we want to maintain a generic sample of a certain limited size $k$ that we can later use to estimate the total weight of arbitrary subsets. This is the classic context of on-line reservoir sampling, thinking of the generic sample as a reservoir. We present an efficient reservoir sampling scheme, $\varopt...

We present two new algorithms for finding optimal strategies for discounted, infinite-horizon, Deterministic Markov Decision Processes (DMDP). The first one is an adaptation of an algorithm of Young, Tarjan and Orlin for finding minimum mean weight cycles. It runs in $O(mn + n^2 \log n)$ time, where n is the number of vertices (or states) and m is th...

Measurement, collection, and interpretation of network usage data commonly involves multiple stages of sampling and aggregation. Examples include sampling packets, aggregating them into flow statistics at a router, and sampling and aggregation of usage records in a network data repository for reporting, query and archiving. Although unbiased estimates o...

We present a simple and fast deterministic algorithm for the minimum k-way cut problem in a capacitated graph, that is, finding a set of edges with minimum total capacity whose removal splits the graph into at least k components. The algorithm packs $O(mk^3 \log n)$ trees. Each new tree is a minimal spanning tree with respect to the edge utilizations,...

Dynamic shortest path algorithms update the shortest paths to take into account a change in an edge weight. This paper describes a new technique that allows the reduction of heap sizes used by several dynamic shortest path algorithms. For unit weight change, the updates can be done without heaps. These reductions almost always reduce the computati...

From a high volume stream of weighted items, we want to maintain a generic sample of a certain limited size $k$ that we can later use to estimate the total weight of arbitrary subsets. This is the classic context of on-line reservoir sampling, thinking of the generic sample as a reservoir. We present a reservoir sampling scheme providing variance o...

We consider the problem of preprocessing an edge-weighted directed graph G to answer queries that ask for the shortest distance from any given node x to any other node y avoiding an arbitrary failed node or link. We describe an oracle (i.e., a simple data structure) for such queries that can be stored in $O(n^2 \log n)$ space, and which allows queries t...

## Projects

Project (1)

Two streams of research were developed in parallel.
(1) A theory of denotational models for Pascal-like programming languages based on set theory, many-sorted algebras and a three-valued predicate calculus. That approach was an alternative to a model based on reflexive domains (by Dana Scott) and continuations. As a tool for defining denotations, syntax and semantics of concrete programming languages, a metalanguage MetaSoft was proposed.
(2) Given a denotational model (in our sense) of a programming language, one can define sound program-constructors, i.e. constructors which given correct components build correct resulting programs. That approach was based on a Hoare-like logic of total correctness with clean termination.
A follow-up of that project started in 2018 under the name of Denotational Engineering.