A Fast Biological Data Mining Algorithm Based on Embedded Frequent Subtree
ABSTRACT In this paper, we present IRTM, a fast biological data mining algorithm based on embedded frequent subtrees. We also propose a string encoding method for representing trees, a scope-list for extending all substrings, and pruning rules that further reduce computation time and space cost. Experimental results show that IRTM achieves a significant performance improvement over previous work.
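The abstract does not spell out IRTM's string encoding or scope-list layout, so the following is only a minimal sketch of one common convention in frequent-subtree mining: a preorder label sequence with a `-1` marker on each backtrack, and a per-node scope `[l, r]` where `l` is the node's preorder index and `r` is the index of its rightmost descendant. The names `Node`, `encode`, `scopes`, and `is_ancestor` are hypothetical and chosen for illustration; IRTM's actual representation may differ.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Node:
    label: str
    children: List["Node"] = field(default_factory=list)


def encode(root: Node) -> List[str]:
    """Preorder labels, with '-1' emitted on each backtrack from child to parent."""
    out: List[str] = []

    def visit(node: Node) -> None:
        out.append(node.label)
        for child in node.children:
            visit(child)
            out.append("-1")  # backtrack marker (assumed convention)

    visit(root)
    return out


def scopes(root: Node) -> List[Tuple[str, int, int]]:
    """For each node in preorder: (label, l, r), r = index of rightmost descendant."""
    result: List[Tuple[str, int, int]] = []
    counter = 0

    def visit(node: Node) -> None:
        nonlocal counter
        idx = counter
        counter += 1
        result.append((node.label, idx, idx))  # placeholder, r is fixed below
        for child in node.children:
            visit(child)
        result[idx] = (node.label, idx, counter - 1)

    visit(root)
    return result


def is_ancestor(a: Tuple[str, int, int], b: Tuple[str, int, int]) -> bool:
    """True if node a is a proper ancestor of node b, judged by scopes alone."""
    return a[1] < b[1] and b[2] <= a[2]


# Example tree: A has children B and E; B has children C and D.
tree = Node("A", [Node("B", [Node("C"), Node("D")]), Node("E")])
encoding = encode(tree)
scope_list = scopes(tree)
```

Under these assumptions, an ancestor-descendant test needs only two integer comparisons on scopes, which is why scope-based lists are a natural fit for embedded subtrees, where a pattern edge may match any ancestor-descendant pair rather than only a parent-child pair.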
- Fundam. Inform. 01/2005; 66:161-198.
ABSTRACT: We are given a large database of customer transactions. Each transaction consists of items purchased by a customer in a visit. We present an efficient algorithm that generates all significant association rules between items in the database. The algorithm incorporates buffer management and novel estimation and pruning techniques. We also present results of applying this algorithm to sales data obtained from a large retailing company, which show the effectiveness of the algorithm.
Proceedings of the 1993 ACM SIGMOD International Conference on Management of Data, edited by Peter Buneman, Sushil Jajodia, 01/1993: pages 207-216; ACM Press.
Conference Proceeding: Efficient Pattern-Growth Methods for Frequent Tree Pattern Mining.
ABSTRACT: Mining frequent tree patterns is an important research problem with broad applications in bioinformatics, digital library, e-commerce, and so on. Previous studies highly suggested that pattern-growth methods are efficient in frequent pattern mining. In this paper, we systematically develop the pattern-growth methods for mining frequent tree patterns. Two algorithms, Chopper and XSpanner, are devised. An extensive performance study shows that the two newly developed algorithms outperform TreeMinerV (13), one of the fastest methods proposed before, in mining large databases. Furthermore, algorithm XSpanner is substantially faster than Chopper in many cases.
Advances in Knowledge Discovery and Data Mining, 8th Pacific-Asia Conference, PAKDD 2004, Sydney, Australia, May 26-28, 2004, Proceedings; 01/2004