Paniz Abedin's research while affiliated with Florida Polytechnic University and other places

Publications (11)

Article
Let T1 and T2 be two rooted trees with an equal number of leaves. The leaves are labeled, and the labeling of the leaves in T2 is a permutation of those in T1. Each node is assigned a weight such that the weight of a node u, denoted by W(u), is greater than the weight of its parent. A node x∈T1 and a node y∈T2 are induced iff their subtrees have a...
Article
The non-overlapping indexing problem is defined as follows: pre-process a given text T[1,n] of length n into a data structure such that whenever a pattern P[1,m] comes as an input, we can efficiently report the largest set of non-overlapping occurrences of P in T. The best-known solution is by Cohen and Porat [ISAAC 2009]. The size of their structu...
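The problem statement above can be illustrated with a naive baseline. The sketch below is not the Cohen and Porat structure or any indexed solution; it scans the text directly and relies on the standard observation that, for occurrences of equal length, picking the leftmost available one greedily yields a largest non-overlapping set. The function names are hypothetical, and positions are 0-based here, whereas the abstract uses 1-based indexing.

```python
def occurrences(text, pattern):
    """Return all 0-based starting positions of pattern in text (overlaps allowed)."""
    if not pattern:
        raise ValueError("pattern must be non-empty")
    positions = []
    start = text.find(pattern)
    while start != -1:
        positions.append(start)
        # Advance by one character so overlapping occurrences are also found.
        start = text.find(pattern, start + 1)
    return positions


def largest_non_overlapping(text, pattern):
    """Greedily pick the leftmost occurrence that does not overlap the last pick."""
    chosen = []
    next_free = 0  # first position not covered by a chosen occurrence
    for pos in occurrences(text, pattern):
        if pos >= next_free:
            chosen.append(pos)
            next_free = pos + len(pattern)
    return chosen


if __name__ == "__main__":
    print(largest_non_overlapping("abababa", "aba"))
```

This baseline takes time proportional to a full scan of the text per query, which is exactly the cost an index is meant to avoid.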
Article
Let T[1,n] be a string of length n and T[i,j] be the substring of T starting at position i and ending at position j. A substring T[i,j] of T is a repeat if it occurs more than once in T; otherwise, it is a unique substring of T. Repeats and unique substrings are of great interest in computational biology and information retrieval. Given string T as...
Article
The shortest unique substring (SUS) problem is an active line of research in the field of string algorithms and has several applications in bioinformatics and information retrieval. The initial version of the problem was proposed by Pei et al. [ICDE’13]. Over the years, many variants and extensions have been pursued, which include positional-SUS, i...
Article
Let T[1,n] be a text of length n and T[i,n] be the suffix starting at position i. Also, for any two strings X and Y, let LCP(X,Y) denote their longest common prefix. The range-LCP of T w.r.t. a range [α,β], where 1≤α<β≤n, is rlcp(α,β) = max{|LCP(T[i,n],T[j,n])| : i≠j and i,j∈[α,β]}. Amir et al. [ISAAC 2011] introduced the indexing version of this problem,...
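The range-LCP definition can be checked on small inputs with a direct, brute-force evaluation. The sketch below is only a reference implementation of the definition, not the indexing solution discussed in the abstract; the function names `lcp_length` and `range_lcp` are hypothetical, and the range endpoints are 1-based and inclusive to match the notation above.

```python
def lcp_length(x, y):
    """Length of the longest common prefix of strings x and y."""
    length = 0
    for a, b in zip(x, y):
        if a != b:
            break
        length += 1
    return length


def range_lcp(text, alpha, beta):
    """Brute-force rlcp(alpha, beta) with 1-based inclusive endpoints.

    Tries every pair of distinct suffix start positions i < j in
    [alpha, beta] and returns the largest LCP length found.
    """
    if not (1 <= alpha < beta <= len(text)):
        raise ValueError("require 1 <= alpha < beta <= len(text)")
    best = 0
    for i in range(alpha, beta):
        for j in range(i + 1, beta + 1):
            # text[i - 1:] is the suffix T[i, n] in 1-based notation.
            best = max(best, lcp_length(text[i - 1:], text[j - 1:]))
    return best


if __name__ == "__main__":
    print(range_lcp("banana", 2, 4))
```

For T = "banana" and the range [2,4], the suffixes are "anana", "nana" and "ana"; the pair "anana" and "ana" shares the prefix "ana", so the value is 3.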
Chapter
Let \(\mathsf {T}[1,n]\) be a string of length n and \(\mathsf {T}[i,j]\) be the substring of \(\mathsf {T}\) starting at position i and ending at position j. A substring \(\mathsf {T}[i,j]\) of \(\mathsf {T}\) is a repeat if it occurs more than once in \(\mathsf {T}\); otherwise, it is a unique substring of \(\mathsf {T}\). Repeats and unique subs...
Conference Paper
M. Alzamel, P. Charalampopoulos, C. S. Iliopoulos, S. P. Pissis, J. Radoszewski, and W.-K. Sung. Faster algorithms for 1-mappability of a sequence. In International Conference on Combinatorial Optimization and Applications, pages 109-121. Springer, 2017. derrien2012f...
Chapter
Let \(\mathsf {T}[1,n]\) be a text of length n and \(\mathsf {T}[i,n]\) be the suffix starting at position i. Also, for any two strings X and Y, let \(\mathsf {LCP}(X, Y)\) denote their longest common prefix. The range-LCP of \(\mathsf {T}\) w.r.t. a range \([\alpha ,\beta ]\), where \(1\le \alpha < \beta \le n\), is \(\mathsf {rlcp}(\alpha ,\beta ) = \max \{|\mathsf {LCP}(\mathsf {T}[i,n], \mathsf {T}[j,n])| \mid i \ne j \text{ and } i,j \in [\alpha ,\beta ]\}\). Amir et...
Preprint
The Average Common Substring (ACS) is a popular alignment-free distance measure for phylogeny reconstruction. The ACS can be computed in O(n) space and time, where n=x+y is the input size. Compressed string matching is the study of string matching problems with the following twist: the input data is in a compressed format and the underlying task...

Citations

... In addition to optimizing and testing KATKA, we are also investigating adapting results [2,1,3] about using LZ77-indexes to find the longest common substring and the maximal exact matches of P and the genomes in T, in order to find their subtrees instead of the subtrees for k-mers. Finally, we are investigating adapting results [6] using LZ77-indexes for document-listing, in order to find the number of genomes in T in which each k-mer of P occurs. ...
... In the related range shortest unique substring problem, defined by Abedin et al. [2], the task is to construct a data structure over T to be able to answer the following type of online queries efficiently. Given a range [i, j], return a shortest string with exactly one occurrence (starting position) in [i, j]. ...
... Although not mentioned explicitly in [16], the size of their data structure (except for the input string) and query time can be respectively written as O(m) space and O(√(log m / log log m) + occ) time with respect to the number m of MUSs of the input string T. Note that all the above algorithms for the SUS problems compute all MUSs of the given string (or some data structure which is essentially equivalent to MUSs) in the preprocessing. We also refer to [1,3,7,17] for related results on the SUS problems. ...
... This version can be applied in computational biology, where factors such as genetic mutation and experimental error make approximate string matching necessary [9]. Another useful application of this approximate version is in computing average common substring, which has been considered as an approach to phylogenomic reconstruction [37,38]. In order to estimate the evolutionary distance between pairs of primate genomes, Thankachan et al. [39] showed that adding a similar k-mismatch parameter to average common substring finding equation leads to better results [9]. ...
... In fact, Kociumaka et al. [44], using their efficient IPM queries as a subroutine, managed to show efficient solutions for other internal problems, such as for computing the periods of a substring (period queries, introduced in [43]), and for checking whether two substrings are rotations of one another (cyclic equivalence queries). Other problems that have been studied in the internal setting include string alignment [58,18], approximate pattern matching [21], dictionary matching [20,19], longest common substring [4], counting palindromes [55], range longest common prefix [3,1,46,34], the computation of the lexicographically minimal or maximal suffix, and minimal rotation [6,41], as well as of the lexicographically kth suffix [7]. We refer the interested reader to the Ph.D dissertation of Kociumaka [42], for a nice exposition. ...