Sequential Pattern Mining in Data Streams Using the Weighted Sliding Window Model.
DOI: 10.1109/ICPADS.2009.64
Conference: IEEE 15th International Conference on Parallel and Distributed Systems, ICPADS 2009, 8-11 December 2009, Shenzhen, China
Mining data streams for knowledge discovery is important to many applications, including Web click stream mining, network intrusion detection, and on-line transaction analysis. In this paper, by analyzing data characteristics, we propose an efficient algorithm, SWSS (Sequential pattern mining with the weighted sliding window model in SPAM), to mine frequent sequential patterns based on the weighted sliding window model. The algorithm gives users more flexibility to specify which sequences they are more interested in. Extensive experiments show that the proposed algorithm is feasible and efficient for mining all sequential patterns specified by users.
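The abstract does not spell out how weights enter the support count, so the following is only a minimal sketch of one common reading of a weighted sliding window, not the SWSS algorithm itself. It assumes the window holds a fixed number of batches, that the user assigns one weight per batch, and that sequences are simplified to lists of single items; the function names `is_subsequence`, `weighted_support`, and `is_frequent` are hypothetical.

```python
from collections import deque


def is_subsequence(pattern, sequence):
    """Return True if the items of `pattern` occur in `sequence` in order."""
    it = iter(sequence)
    # `item in it` consumes the iterator, so order is respected.
    return all(item in it for item in pattern)


def weighted_support(pattern, window, weights):
    """Sum, over batches, of batch weight times the number of matching sequences."""
    total = 0.0
    for batch, weight in zip(window, weights):
        total += weight * sum(is_subsequence(pattern, seq) for seq in batch)
    return total


def is_frequent(pattern, window, weights, min_support):
    """Assumed rule: compare weighted support with a weighted share of the window size."""
    threshold = min_support * sum(w * len(b) for b, w in zip(window, weights))
    return weighted_support(pattern, window, weights) >= threshold


# A window of three batches (oldest first); appending a fourth would evict the oldest.
window = deque(maxlen=3)
window.append([["a", "b", "c"], ["a", "c"]])
window.append([["b", "c"], ["a", "b"]])
window.append([["a", "b", "c"], ["c", "a"]])

# Hypothetical user-chosen weights that favour recent batches.
weights = [0.2, 0.3, 0.5]

support = weighted_support(["a", "c"], window, weights)
frequent = is_frequent(["a", "c"], window, weights, min_support=0.4)
```

Raising the weight of a batch makes patterns in that batch count for more, which is one way a user could express which part of the stream matters most.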
ABSTRACT: Previous studies indicate that key path mining is a significant task in software executing network analysis. The software executing network stream, which can meaningfully reflect the dynamic characteristics of software execution, has not yet been studied. Moreover, previous key path mining algorithms ignore how pattern decay in streams and the time intervals between function calls in a software executing network affect pattern importance. Therefore, this paper puts forward an algorithm for key path mining in software executing network streams, called KPM-SENStream. First, a software executing network stream is defined as a continuous and infinite sequence of software function calls with time stamps; key paths are maintained as the window slides. Second, a mining model of pattern decay is built on continuous sliding windows. Third, a time-interval weight is designed to reflect the different importance of different time intervals. Finally, key paths are obtained and maintained by mining execution paths in the dynamic network. Experimental results show that the KPM-SENStream algorithm is efficient in finding key paths in software executing streams and scales well. Furthermore, by considering the time intervals between software function calls and pattern decay, more interesting and accurate key paths can be found.
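The abstract gives no formulas for pattern decay or for the time-interval weight, so the sketch below only illustrates the general idea with assumed forms: an exponential decay factor applied once per window slide, and a weight that shrinks as the gap between two function calls grows. The names `decayed_support` and `interval_weight`, and the parameters `decay` and `scale`, are hypothetical and are not taken from the paper.

```python
def decayed_support(counts_per_window, decay):
    """Fold per-window counts (oldest first) into one score, fading older windows.

    Assumed model: each window slide multiplies the running score by `decay`
    (0 < decay <= 1) before the newest count is added.
    """
    support = 0.0
    for count in counts_per_window:
        support = support * decay + count
    return support


def interval_weight(interval, scale):
    """Assumed weight in (0, 1]: calls that follow each other quickly weigh more."""
    return 1.0 / (1.0 + interval / scale)


# A path seen 4, 2, and 1 times in three consecutive windows (oldest first).
score = decayed_support([4, 2, 1], decay=0.5)

# Two calls 10 time units apart, with a hypothetical scale of 10 time units.
weight = interval_weight(10, scale=10)
```

With `decay=1.0` the score reduces to a plain count over all windows, so the decay factor is what lets recent behaviour dominate.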