Publications (6)
0 Total impact
ABSTRACT: The substring traversal problem is the problem of enumerating all branching substrings appearing in a given text. Although this problem is easily solvable with the suffix tree of McCreight (1976), a space-efficient and practically fast solution is important. We devise a simple and efficient algorithm that simulates the traversal of the suffix tree for a given text with the suffix array of Manber and Myers (1993) and Gonnet, Baeza-Yates, and Snider (1992). The algorithm runs in O(n) time and 5 bulk I/O with the suffix array and an additional structure called the height array, while the naive algorithm using binary search on the suffix array requires O(n^2) time in the worst case. The 7N-byte space requirement of our algorithm is smaller than the 15N bytes required by the traversal algorithm with the suffix tree. A linear-time algorithm for computing the height array from the suffix array is also presented. Computer experiments on real datasets showed that our traversal algorithm with the suffix array is an order of magnitude faster than the naive simulation method and comparable to the traversal algorithm with the suffix tree.
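The traversal described in this abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the function name `branching_substrings` is hypothetical, the suffix array and height array are built naively for brevity (not in linear time), and the end of the text is assumed to count as a distinct continuation, as if a unique terminator were appended. The sketch only shows how a stack over the height array can visit the internal nodes of the suffix tree bottom-up without building the tree.

```python
def branching_substrings(text):
    """Return (substring, occurrence_count) for each branching substring.

    Assumption: the end of the text counts as a distinct continuation.
    """
    n = len(text)
    # Naive suffix array (sorting whole suffixes); for illustration only.
    sa = sorted(range(n), key=lambda i: text[i:])

    def common_prefix_len(a, b):
        k = 0
        while a + k < n and b + k < n and text[a + k] == text[b + k]:
            k += 1
        return k

    # height[p] = length of the common prefix of suffixes sa[p-1] and sa[p].
    height = [0] + [common_prefix_len(sa[p - 1], sa[p]) for p in range(1, n)]

    result = []
    stack = [(0, 0)]  # (height, left boundary); the root stays at the bottom
    for p in range(1, n + 1):
        h = height[p] if p < n else 0  # final 0 closes all open intervals
        left = p - 1
        # Close every interval deeper than h: each one is an internal node.
        while stack[-1][0] > h:
            top_h, top_left = stack.pop()
            start = sa[top_left]
            result.append((text[start:start + top_h], p - top_left))
            left = top_left
        # Open a new interval if the current height is deeper than the top.
        if stack[-1][0] < h:
            stack.append((h, left))
    return result


print(sorted(branching_substrings("banana")))
```

Each position is pushed and popped at most once, so the stack loop itself is linear in the length of the height array; the naive construction steps above are what a linear-time method would replace.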
Conference Paper: Linear-Time Longest-Common-Prefix Computation in Suffix Arrays and Its Applications.
ABSTRACT: We present a linear-time algorithm to compute the longest common prefix information in suffix arrays. As two applications of our algorithm, we show that our algorithm is crucial to the effective use of block-sorting compression, and we present a linear-time algorithm to simulate the bottom-up traversal of a suffix tree with a suffix array combined with the longest common prefix information.
Combinatorial Pattern Matching, 12th Annual Symposium, CPM 2001, Jerusalem, Israel, July 1-4, 2001, Proceedings; 01/2001
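As a hedged illustration of linear-time longest-common-prefix computation, the sketch below uses a widely known rank-based approach: it processes suffixes in text order and reuses the previous match length minus one. Whether this matches the paper's algorithm in every detail is an assumption. The names `suffix_array` and `lcp_array` are hypothetical, and the suffix array itself is built naively only to keep the example self-contained.

```python
def suffix_array(text):
    # Naive construction by sorting whole suffixes; for illustration only.
    return sorted(range(len(text)), key=lambda i: text[i:])


def lcp_array(text, sa):
    """lcp[p] = common prefix length of suffixes sa[p-1] and sa[p]; lcp[0] = 0."""
    n = len(text)
    rank = [0] * n  # rank[i] = position of suffix i in the suffix array
    for pos, start in enumerate(sa):
        rank[start] = pos

    lcp = [0] * n
    h = 0  # match length carried over from the previous text position
    for i in range(n):
        if rank[i] > 0:
            j = sa[rank[i] - 1]  # suffix preceding suffix i in sorted order
            while i + h < n and j + h < n and text[i + h] == text[j + h]:
                h += 1
            lcp[rank[i]] = h
            if h > 0:
                h -= 1  # dropping the first character loses at most one match
        else:
            h = 0
    return lcp


sa = suffix_array("banana")
print(sa, lcp_array("banana", sa))
```

Because `h` decreases by at most one per iteration and never exceeds the text length, the total number of character comparisons in the loop is linear in the text length.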
ABSTRACT: We present a linear-time algorithm to compute the longest common prefix information in suffix arrays. As two applications of our algorithm, we show that our algorithm is crucial to the effective use of block-sorting compression, and we present a linear-time algorithm to simulate the bottom-up traversal of a suffix tree with a suffix array combined with the longest common prefix information.
12/2000: pages 181-192;
ABSTRACT: This paper considers knowledge discovery by sort regular patterns, which are strings over sort letters representing finite sets of basic letters. We devise a learning algorithm for the class based on the minimal multiple generalization technique, and evaluate the method by experiments on biosequences from the GenBank database. The experiments show that a relatively simple sort pattern can represent a complex motif in biosequences, and that the learning algorithm works well on noisy examples.
1 Introduction
Discovery of consensus motifs plays an important role in automatic analysis of biosequences. A consensus motif is a description of common syntactic features, in terms of sequences, of a group of biosequences that are believed to be biologically related in structures, functions, or origins. Therefore, discovery of consensus motifs is an important step to understand the meaning of biosequences. We consider the discovery of consensus motifs by a class of simple string patterns. In bioinf...
ABSTRACT: efficient algorithm that finds a best pattern that maximizes the precision and runs in O(k^(d-1) n log^(d+1) n) expected time and with O(k^(d-1) n) space for uniformly random texts. We also discuss an implementation technique for the algorithm on the suffix array indexing structure.
data mining, maximum agreement problem
1 Introduction
Data mining is the scientific study of methods for semi-automatically extracting non-obvious regularities and patterns from the large volumes of data accumulated in databases. Data mining is currently being actively applied in a variety of target fields, including business and science and technology. As for text databases, which have developed remarkably in recent years: 1. they have no explicit structure, 2. they consist of digitized documents with diverse content, 3. they range from several gigabytes to several terabytes ...
01/1999;
Publication Stats
374 Citations
Top Journals
Institutions

2000

Kyushu University
Hukuoka, Fukuoka, Japan
