Sequential Pattern Mining in Data Streams Using the Weighted Sliding Window Model.
ABSTRACT Mining data streams for knowledge discovery is important to many applications, including Web click stream mining, network intrusion detection, and on-line transaction analysis. In this paper, by analyzing data characteristics, we propose an efficient algorithm, SWSS (Sequential pattern mining with the weighted sliding window model in SPAM), to mine frequent sequential patterns based on the weighted sliding window model. This algorithm gives users more flexibility to specify which sequences they are most interested in. Extensive experiments show that the proposed algorithm is feasible and efficient for mining all sequential patterns that users specify.
- Data Engineering, 2001. Proceedings. 17th International Conference on; 02/2001
Conference Paper: CloSpan: Mining Closed Sequential Patterns in Large Databases.
ABSTRACT: Previous sequential pattern mining algorithms mine the full set of frequent subsequences satisfying a min_sup threshold in a sequence database. However, since a frequent long sequence contains a combinatorial number of frequent subsequences, such mining will generate an explosive number of frequent subsequences for long patterns, which is prohibitively expensive in both time and space.
Proceedings of the Third SIAM International Conference on Data Mining, San Francisco, CA, USA, May 1-3, 2003; 01/2003
ABSTRACT: The problem of mining sequential patterns was recently introduced in [AS95]. We are given a database of sequences, where each sequence is a list of transactions ordered by transaction-time, and each transaction is a set of items. The problem is to discover all sequential patterns with a user-specified minimum support, where the support of a pattern is the number of data-sequences that contain the pattern. An example of a sequential pattern is "5% of customers bought 'Foundation' and 'Ringworld' in one transaction, followed by 'Second Foundation' in a later transaction". We generalize the problem as follows. First, we add time constraints that specify a minimum and/or maximum time period between adjacent elements in a pattern. Second, we relax the restriction that the items in an element of a sequential pattern must come from the same transaction, instead allowing the items to be present in a set of transactions whose transaction-times are within a user-specified time window. Third, g...
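The basic support definition in the abstract above (before the time-constraint and time-window generalizations) can be illustrated with a minimal Python sketch. The function names `contains` and `support` and the toy database are assumptions made for illustration only; they are not taken from any of the listed papers, and this is not the mining algorithm itself, only the support count it relies on.

```python
from typing import List, Set

# A data-sequence is a list of transactions ordered by transaction-time;
# each transaction (and each pattern element) is a set of items.
Sequence = List[Set[str]]


def contains(data_sequence: Sequence, pattern: Sequence) -> bool:
    """Return True if every pattern element is a subset of some transaction,
    with the matched transactions appearing in the same order as the pattern.
    Greedy earliest-match is sufficient when there are no time constraints."""
    pos = 0
    for element in pattern:
        # Advance to the first remaining transaction that holds all items.
        while pos < len(data_sequence) and not element <= data_sequence[pos]:
            pos += 1
        if pos == len(data_sequence):
            return False
        pos += 1  # the next element must match a strictly later transaction
    return True


def support(database: List[Sequence], pattern: Sequence) -> int:
    """Number of data-sequences in the database that contain the pattern."""
    return sum(1 for seq in database if contains(seq, pattern))


# Hypothetical toy database, loosely following the book example in the text.
database = [
    [{"Foundation", "Ringworld"}, {"Second Foundation"}],
    [{"Foundation"}, {"Ringworld"}, {"Second Foundation"}],
    [{"Second Foundation"}, {"Foundation", "Ringworld"}],
    [{"Foundation", "Ringworld", "Dune"}, {"Dune"}, {"Second Foundation"}],
]
pattern = [{"Foundation", "Ringworld"}, {"Second Foundation"}]

print(support(database, pattern))  # 2
```

Only the first and fourth data-sequences contain the pattern: the second splits 'Foundation' and 'Ringworld' across two transactions, and the third has the transactions in the wrong order. A pattern is frequent when this count meets the user-specified minimum support.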