A RUN-TIME EFFICIENT IMPLEMENTATION OF COMPRESSED PATTERN MATCHING AUTOMATA
International Journal of Foundations of Computer Science (Impact Factor: 0.3). 04/2012; 20(04). DOI: 10.1142/S0129054109006838
We present a run-time efficient implementation of the compressed pattern matching automata (CPMA) of Kida et al. (2003), where a text is given as a truncation-free collage system such that the variable sequence is encoded by any prefix code. We first build CPMA directly from P and in time and space, and then convert it into the decoder-embedded CPMA (DECPMA), where |P| is the pattern length and is the number of variables defined in . The bound improves the bound achieved by a straightforward application of the method of Kida et al. We experimentally show that a combination of recursive-pairing compression and byte-oriented Huffman coding allows both a high compression ratio and high-speed CPM.
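To make the idea of matching over a truncation-free collage system concrete, the following is a minimal Python sketch, not the construction of the paper. It assumes a toy grammar in which every variable is the concatenation of two symbols (as produced by recursive pairing), and it naively tabulates, for each variable and each KMP automaton state, the state reached and the number of matches found after reading the variable's whole expansion. The names `build_kmp_dfa`, `count_matches_compressed`, `expand`, and the representation of rules as Python tuples are hypothetical choices made for illustration; the prefix-code decoding step and the DECPMA conversion are not modeled.

```python
def build_kmp_dfa(pattern, alphabet):
    """Return delta[state][char] for the KMP automaton of a non-empty pattern."""
    m = len(pattern)
    delta = [dict() for _ in range(m + 1)]
    for c in alphabet:
        delta[0][c] = 1 if pattern[0] == c else 0
    x = 0  # state reached by the pattern with its first character dropped
    for q in range(1, m + 1):
        for c in alphabet:
            delta[q][c] = delta[x][c]  # mismatch: behave like the restart state
        if q < m:
            delta[q][pattern[q]] = q + 1  # match: advance
            x = delta[x][pattern[q]]
    return delta


def count_matches_compressed(pattern, rules, sequence, alphabet):
    """Count occurrences of pattern without expanding the text.

    rules[i] = (left, right); a symbol is either a single character (str)
    or the integer index of an earlier rule. sequence is a list of symbols.
    """
    delta = build_kmp_dfa(pattern, alphabet)
    m = len(pattern)
    jump, out = [], []  # jump[i][q], out[i][q] for variable i and state q

    def step(q, sym):
        # Consume one symbol (character or variable) from state q.
        if isinstance(sym, str):
            r = delta[q][sym]
            return r, (1 if r == m else 0)
        return jump[sym][q], out[sym][q]

    for left, right in rules:
        j_row, o_row = [], []
        for q in range(m + 1):
            q1, c1 = step(q, left)
            q2, c2 = step(q1, right)
            j_row.append(q2)
            o_row.append(c1 + c2)
        jump.append(j_row)
        out.append(o_row)

    q, total = 0, 0
    for sym in sequence:
        q, c = step(q, sym)
        total += c
    return total


def expand(rules, sequence):
    """Decompress the toy grammar; used only to cross-check the result."""
    def ex(sym):
        if isinstance(sym, str):
            return sym
        left, right = rules[sym]
        return ex(left) + ex(right)
    return "".join(ex(s) for s in sequence)


# Toy input: rule 0 -> "ab", rule 1 -> "abab"; the sequence expands to "abababa".
rules = [("a", "b"), (0, 0)]
sequence = [1, 0, "a"]
print(count_matches_compressed("aba", rules, sequence, "ab"))
```

On this toy input the pattern "aba" occurs three times in the expanded text "abababa", and the sketch reports the same count while scanning only three compressed symbols. The tables built here have one entry per (variable, state) pair, which is the kind of straightforward cost that a more careful construction would aim to reduce.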
ABSTRACT: A framework of context-sensitive grammar transform is proposed. A greedy compression algorithm based on the transform model is presented, together with a Knuth-Morris-Pratt (KMP)-type compressed pattern matching (CPM) algorithm. The compression performance is comparable to that of gzip and Re-Pair. Our CPM algorithm searches almost twice as fast as the KMP-type CPM algorithm on Byte-Pair-Encoding by Shibata et al. (2000) and, for short patterns, faster than the Boyer-Moore-Horspool algorithm with the stopper encoding by Rautio et al. (2002), which is regarded as one of the best combinations for practically fast search. 11/2008; pages 27-38.