# Venkatesh Srinivasan

University of Victoria | UVIC · Department of Computer Science

PhD

## About

119

Publications

10,262

Reads

1,401

Citations

## Publications

Authentication plays a critical role in the security of quantum key distribution (QKD) protocols. We propose using Polynomial Hash and its variants for authentication of variable length messages in QKD protocols. Since universal hashing is used not only for authentication in QKD but also in other steps in QKD like error correction and privacy ampli...

Graphlet enumeration is a basic task in graph analysis with many applications. Thus it is important to be able to perform this task within a reasonable amount of time. However, this objective is challenging when the input graph is very large, with millions of nodes and edges. Known solutions are limited in terms of the scale of the graph that they...

Truss decomposition is a popular notion of hierarchical dense substructures in graphs. In a nutshell, k-truss is the largest subgraph in which every edge is contained in at least k triangles. Truss decomposition aims to compute k-trusses for each possible value of k. There are many works that study truss decomposition in deterministic graphs. Howev...

We consider misinformation propagating through a social network and study the problem of its prevention. The goal is to identify a set of $k$ users that need to be convinced to adopt a limiting campaign so as to minimize the number of people that end up adopting the misinformation. This work presents Reverse Prevention Sampling (RPS), an algorithm...

A \emph{simple} $s,t$ path $P$ in a rectangular grid graph $\mathbb{G}$ is a Hamiltonian path from the top-left corner $s$ to the bottom-right corner $t$ such that each \emph{internal} subpath of $P$ with both endpoints $a$ and $b$ on the boundary of $\mathbb{G}$ has the minimum number of bends needed to travel from $a$ to $b$ (i.e., $0$, $1$, or $...

We give a complete structure theorem for 1-complex s, t Hamiltonian paths in rectangular grid graphs. We use the structure theorem to design an algorithm to reconfigure one such path into any other in linear time, making a linear number of switch operations in grid cells.

The severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is responsible for millions of deaths around the world. To help contribute to the understanding of crucial knowledge and to further generate new hypotheses relevant to SARS-CoV-2 and human protein interactions, we make use of the information abundant Biomine probabilistic database and...

Mining dense subgraphs where vertices connect closely with each other is a common task when analyzing graphs. A very popular notion in subgraph analysis is core decomposition. Recently, Esfahani et al. presented a probabilistic core decomposition algorithm based on graph peeling and Central Limit Theorem (CLT) that is capable of handling very large...

The SARS-CoV-2 coronavirus is responsible for millions of deaths around the world. To help contribute to the understanding of crucial knowledge and to further generate new hypotheses relevant to SARS-CoV-2 and human protein interactions, we make use of the information abundant Biomine probabilistic database and extend the experimentally identified...

We study the following reconfiguration problem: given two s, t Hamiltonian paths connecting diagonally opposite corners s and t of a rectangular grid graph G, can we transform one to the other using only local operations in the grid cells? In this work, we introduce the notion of simple s, t Hamiltonian paths, and give an algorithm to reconfigure su...

Triangle counting is a major graph problem with several applications in social network analysis, anomaly detection, etc. One of the most popular triangle computational models considered is Edge Streaming in which the edges arrive in the form of a graph stream. We categorize the existing literature into two categories: Fixed Memory (FM) approach, an...

Triangle enumeration is a fundamental task in graph data analysis with many applications. Recently, Park et al. proposed a distributed algorithm, PTE (Pre-partitioned Triangle Enumeration), that, unlike previous works, scales well using multiple high end machines and can handle very large real-world networks. This work presents a serverless impleme...

In this paper, we first give explicit formulas for the number of solutions of unweighted linear congruences with distinct coordinates. Our main tools are properties of Ramanujan sums and of the discrete Fourier transform of arithmetic functions. Then, as an application, we derive an explicit formula for the number of codewords in the Varshamov--Ten...

Universal hash functions, discovered by Carter and Wegman in 1979, are of great importance in computer science with many applications. MMH$^*$ is a well-known $\triangle$-universal hash function family, based on the evaluation of a dot product modulo a prime. In this paper, we introduce a generalization of MMH$^*$, that we call GMMH$^*$, using the...

Let $\Z_n[i]$ be the ring of Gaussian integers modulo a positive integer $n$. Very recently, Camarero and Mart\'{i}nez [IEEE Trans. Inform. Theory, {\bf 62} (2016), 1183--1192], showed that for every prime number $p>5$ such that $p\equiv \pm 5 \pmod{12}$, the Cayley graph $\mathcal{G}_p=\textnormal{Cay}(\Z_p[i], S_2)$, where $S_2$ is the set of uni...

A fundamental challenge in graph mining is the ever-increasing size of datasets. Graph summarization aims to find a compact representation resulting in faster algorithms and reduced storage needs. The flip side of graph summarization is the loss of utility which diminishes its usability. The key questions we address in this paper are: (1) How to sum...

Finding dense components in graphs is of great importance in analyzing the structure of networks. Popular and computationally feasible frameworks for discovering dense subgraphs are core and truss decompositions. Recently, Sariy\"uce et al. introduced nucleus decomposition, a generalization which uses higher-order structures and can reveal interest...

In this work, we consider misinformation propagating through a social network and study the problem of its prevention. In this problem, a "bad" campaign starts propagating from a set of seed nodes in the network and we use the notion of a limiting (or "good") campaign to counteract the effect of misinformation. The goal is to identify a set of k us...

MapReduce is a widely used parallel computing paradigm for the big data realm on the scale of terabytes and higher. The introduction of minimal MapReduce algorithms promised efficiency in load balancing among participating machines by ensuring that partition skew (where some machines end up processing a significantly larger fraction of the input th...

Stack Overflow is the most popular Q&A website among software developers. As a platform for knowledge sharing and acquisition, the questions posted on Stack Overflow usually contain a code snippet. Determining the programming language of a source code file has been considered in the research community; it has been shown that Machine Learning (ML) a...

Recently, Grynkiewicz et al. [{\it Israel J. Math.} {\bf 193} (2013), 359--398], using tools from additive combinatorics and group theory, proved necessary and sufficient conditions under which the linear congruence $a_1x_1+\cdots +a_kx_k\equiv b \pmod{n}$, where $a_1,\ldots,a_k,b,n$ ($n\geq 1$) are arbitrary integers, has a solution $\langle x_1,\...

Truss decomposition is popular for finding dense substructures in graphs. Discovering trusses in deterministic graphs has been widely discussed in the literature. However, with the intrinsic uncertainty in many networks such as social, biological, and communication networks, it is of great importance to study truss decomposition in a probabilistic...

Core decomposition is a popular tool for analyzing the structure of network graphs. For probabilistic graphs the computation comes with several challenges and the state-of-the-art approach is not scalable to large graphs. One of the challenges is to compute tail probabilities of vertex degrees in probabilistic graphs. To address this we employ a sp...

Determining the programming language of a source code file has been considered in the research community; it has been shown that Machine Learning (ML) and Natural Language Processing (NLP) algorithms can be effective in identifying the programming language of source code files. However, determining the programming language of a code snippet or a fe...

Stack Overflow is the most popular Q&A website among software developers. As a platform for knowledge sharing and acquisition, the questions posted in Stack Overflow usually contain a code snippet. Stack Overflow relies on users to properly tag the programming language of a question and it simply assumes that the programming language of the snippet...

In this paper, we first give explicit formulas for the number of solutions of unweighted linear congruences with distinct coordinates. Our main tools are properties of Ramanujan sums and of the discrete Fourier transform of arithmetic functions. Then, as an application, we derive an explicit formula for the number of codewords in the Varshamov–Tene...

The analysis of characteristics of large-scale graphs has shown tremendous benefits in social networks, spam detection, epidemic disease control, analyzing software systems and so on. However, today, processing graph algorithms on massive datasets is not an easy task not only because of the large data volume, but also the complexity of the graph al...

In this work, we consider misinformation propagating through a social network and study the problem of its prevention. In this problem, a "bad" campaign starts propagating from a set of seed nodes in the network and we use the notion of a limiting (or "good") campaign to counteract the effect of misinformation. The goal is then to identify a subset...

Universal hashing, discovered by Carter and Wegman in 1979, has many important applications in computer science. MMH$^*$, which was shown to be $\Delta$-universal by Halevi and Krawczyk in 1997, is a well-known universal hash function family. We introduce a variant of MMH$^*$, that we call GRDH, where we use an arbitrary integer $n>1$ instead of pr...

Often graph theory is used to model and analyze different behaviors of networks including social networks. Nowadays, social networks have become very popular and social network providers try to expand their networks by encouraging people to stay engaged and active. Studies show that engagement and activities of people in social networks influence e...

The Markov Modulated Poisson Process (MMPP) has been extensively studied in random process theory and widely applied in various applications involving Poisson arrivals whose rate varies following a Markov process. Despite the rich literature on MMPP, very little is known on its intricate temporal dependence structure. No exact solution is available...

We consider the problem of assigning radii to a given set of points in the plane, such that the resulting set of disks is connected, and the sum of radii is minimized. We prove that the problem is NP-hard in planar weighted graphs if there are upper bounds on the radii and sketch a similar proof for planar point sets. For the case when there are no...

In this paper, using properties of Ramanujan sums and of the discrete Fourier transform of arithmetic functions, we give an explicit formula for the number of solutions of the linear congruence
$a_1x_1+\cdots +a_kx_k\equiv b \pmod{n}$, with $\gcd(x_i,n)=t_i$ ($1\leq i\leq k$), where $a_1,t_1,\ldots,a_k,t_k, b,n$ ($n\geq 1$) are arbitrary integers....

The minimum feedback arc set problem is an NP-hard problem on graphs that seeks a minimum set of arcs which, when removed from the graph, leave it acyclic. In this work, we investigate several approximations for computing a minimum feedback arc set with the goal of comparing the quality of the solutions and the running times. Our investigation is m...

Universal hashing, discovered by Carter and Wegman in 1979, has many important applications in computer science. As a well-known family, one can mention MMH* which was shown to be Δ-universal by Halevi and Krawczyk in 1997. In this paper, we first introduce a variant of MMH* that we call GRDH. Then via a novel approach, namely, connecting the univ...

Let $b,n\in \mathbb{Z}$, $n\geq 1$, and ${\cal D}_1, \ldots, {\cal D}_{\tau(n)}$ be all positive divisors of $n$. For $1\leq l \leq \tau(n)$, define ${\cal C}_l:=\lbrace 1 \leqslant x\leqslant n \; : \; (x,n)={\cal D}_l\rbrace$. In this paper, by combining ideas from the finite Fourier transform of arithmetic functions and Ramanujan sums, we give a...

The weighted bipartite B-matching (WBM) problem models a host of data management applications, ranging from recommender systems to Internet advertising and e-commerce. Many of these applications, however, demand versatile assignment constraints, which WBM is weak at modelling.
In this paper, we investigate powerful generalisations of WBM. We first...

Portable smart devices have become prevalent and are used for ubiquitous access to the Internet in our daily life. Taking advantage of this trend, brick-and-mortar retailers have been increasingly deploying free Wi-Fi hotspots to provide easy Internet access for their customers. This opens the opportunity for retailers to collect customer informati...

In this paper, using properties of Ramanujan sums and of the discrete Fourier transform of arithmetic functions, we give an explicit formula for the number of solutions of the linear congruence $a_1x_1+\cdots +a_kx_k\equiv b \pmod{n}$, with $\gcd(x_i,n)=t_i$ ($1\leq i\leq k$), where $a_1,t_1,\ldots,a_k,t_k, b,n$ ($n\geq 1$) are arbitrary integers. As a consequence, we derive necessary and sufficient conditions under which the above restricted linear congru...

Let $\mathbb{Z}_n[i]$ be the ring of Gaussian integers modulo a positive integer $n$. Very recently, Camarero and Martinez et al. showed that for every prime number $p > 5$ such that $p \equiv \pm 5 \pmod{12}$, the Cayley graph $\mathcal{G}$...

We study connectivity relations among points, where the precise location of each input point lies in a region of uncertainty. We distinguish two fundamental scenarios under which uncertainty arises. In the favorable Best-Case Uncertainty, each input point can be chosen from a given set to yield the best possible objective value. In the unfavorable...

In ICDE'12, Afrati, Das Sarma, Menestrina, Parameswaran and Ullman proposed similarity join algorithms for MapReduce. In this paper, we evaluate and extend their research, testing their proposed algorithms using edit distance and Jaccard similarity. We provide details of adaptations needed to implement their algorithms based on these similarity mea...

The weighted bipartite b-matching problem (WBM) plays a significant role in many real-world applications, including resource allocation, scheduling, Internet advertising, and E-commerce. WBM has been widely studied and efficient matching algorithms are well known. In this work, we study a novel variant of WBM, called conflict-aware WBM (CA-WBM), wh...

Universal hash functions, discovered by Carter and Wegman in 1979, are of great importance in computer science with many applications. MMH* is a well-known △-universal hash function family, based on the evaluation of a dot product modulo a prime. In this paper, we introduce a generalization of MMH*, that we call GMMH*, using the same construction a...

Graphs embedded into surfaces have many important applications, in particular, in combinatorics, geometry, and physics. For example, ribbon graphs and their counting is of great interest in string theory and quantum field theory (QFT). Recently, Koch, Ramgoolam, and Wen [Nuclear Phys.B 870 (2013), 530–581] gave a refined formula for counting ribbon...

We propose generic constructions of public-key encryption schemes, satisfying key-dependent message (KDM) security for projections and different forms of key-leakage resilience, from CPA-secure private-key encryption schemes with two main abstract properties: (1) a form of (additive) homomorphism with respect to both plaintexts and randomness, and...

Studying the topology of a network is critical to inferring underlying dynamics such as tolerance to failure, group behavior and spreading patterns. k-core decomposition is a well-established metric which partitions a graph into layers from external to more central vertices. In this paper we aim to explore whether k-core decomposition of large netw...

Studying the topology of a network is critical to inferring underlying dynamics such as tolerance to failure, group behavior and spreading patterns. k-core decomposition is a well-established metric which partitions a graph into layers from external to more central vertices. In this paper we aim to explore whether k-core decomposition of large netw...

We report experimental results for the MapReduce algorithms proposed by Afrati, Das Sarma, Menestrina, Parameswaran and Ullman in ICDE'12 to compute fuzzy joins of binary strings using Hamming Distance. Their algorithms come with complete theoretical analysis, however, no experimental evaluation is provided. They argue that there is a tradeoff betw...

In an emerging trend, more and more Internet users search for information from Community Question and Answer (CQA) websites, as interactive communication in such websites provides users with a rare feeling of trust. More often than not, end users look for instant help when they browse the CQA websites for the best answers. Hence, it is imperative t...

Universal hashing, discovered by Carter and Wegman in 1979, has many
applications in computer science, for example, in cryptography, randomized
algorithms, dictionary data structures etc. The following family is a famous
construction: Let $p$ be a prime and $k$ be a positive integer. Define
\begin{align*} \text{MMH}^*:=\lbrace g_{\mathbf{x}} \; : \...

We study the problem of anonymizing data by column suppression. Meyerson and Williams showed that this problem is NP-hard for $k \geq 3$. The complexity of this problem for $k = 2$ remained open. In this note, we show that 2-anonymizing data by suppressing the minimum number of columns is also NP-hard. In fact, we prove a stronger claim that this problem is NP-hard...

We initiate a systematic study to help distinguish a special group of online users, called hidden paid posters, or termed “Internet water army” in China, from the legitimate ones. On the Internet, the paid posters represent a new type of online job opportunities. They get paid for posting comments or articles on different online communities and web...

In this work, we study the problem of clearing contamination spreading
through a large network where we model the problem as a graph searching game.
The problem can be summarized as constructing a search strategy that will leave
the graph clear of any contamination at the end of the searching process in as
few steps as possible. We show that this p...

The majority of recommender systems are designed to recommend items (such as
movies and products) to users. We focus on the problem of recommending buyers
to sellers which comes with new challenges: (1) constraints on the number of
recommendations buyers are part of before they become overwhelmed, (2)
constraints on the number of recommendations se...

We study three-way joins on MapReduce. Joins are very useful in a multitude
of applications from data integration and traversing social networks, to mining
graphs and automata-based constructions. However, joins are expensive, even for
moderate data sets; we need efficient algorithms to perform distributed
computation of joins using clusters of man...

Since the Netflix Prize competition, latent factor models (LFMs) have become the comparison "staples" for many of the recent recommender methods. The performance improvement of LFMs over baseline approaches, however, hovers at only low percentage numbers. Therefore, it is time for a better understanding of their real power beyond the overall RMSE (...

Regret minimizing sets are a recent approach to representing a dataset D by a small subset R of size r of representative data points. The set R is chosen such that executing any top-1 query on R rather than D is minimally perceptible to any user. However, such a subset R may not exist, even for modest sizes, r. In this paper, we introduce the relax...

We consider the recently introduced monochromatic reverse top−k query which asks for, given a (possibly new) tuple q and a dataset \(\mathcal{D}\), all possible top−k queries on \(\mathcal{D}\cup\{q\}\) for which q is in the result. Towards this problem, we introduce the first query-agnostic approach, which leads to an efficient index. We present t...

For a graph-based representation of a social network, the identity of participants can be uniquely determined if an adversary has background structural knowledge about the graph. We focus on degree-based attacks, wherein the adversary knows the degrees of particular target vertices and we aim to protect the anonymity of participants through k-anony...

In an emerging trend, more and more Internet users search for information
from Community Question and Answer (CQA) websites, as interactive communication
in such websites provides users with a rare feeling of trust. More often than
not, end users look for instant help when they browse the CQA websites for the
best answers. Hence, it is imperative t...