BFSPMiner: an effective and efficient batch-free algorithm for mining sequential patterns over data streams

M. Hassani (Corresponding author), D. Töws, A. Cuzzocrea, T. Seidl

Research output: Contribution to journal, Article, Academic, peer review

19 Citations (Scopus)
121 Downloads (Pure)

Abstract

Supporting sequential pattern mining over data streams is a relevant problem in data stream mining research. Current proposals in the literature are based on the well-known PrefixSpan approach and are able to effectively bound the error of the discovered patterns. These proposals divide the target stream into a collection of manageable chunks (batches), i.e., pieces of the stream, in order to gain effectiveness and efficiency. Unfortunately, mining patterns from stream chunks introduces additional errors with respect to the basic application scenario in which the target stream is mined continuously, in a non-batch manner. This is due to several reasons. First, since batches are processed individually, patterns that contain items from two consecutive batches are lost. Second, in most batch-based approaches, the decision about the frequency of a pattern is made locally inside a single batch. Thus, if a pattern is frequent in the stream but its items are scattered over different batches, it will be continuously pruned and will never become frequent, because the algorithm lacks the “complete-picture” perspective. To address these problems, this paper introduces and experimentally assesses BFSPMiner, a Batch-Free Sequential Pattern Miner algorithm for effectively and efficiently mining patterns in streams without being constrained to traditional batch-based processing. This allows us, for instance, to discover frequent patterns that would be lost under alternative batch-based stream mining processing models. We complement our analytical contributions with a comprehensive experimental evaluation of BFSPMiner on real-world data streams, in comparison with current batch-based stream sequential pattern mining algorithms.
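
The two sources of error named in the abstract (patterns that span a batch boundary, and pruning decided locally per batch) can be illustrated with a small Python sketch. This is not BFSPMiner and not PrefixSpan: it is a deliberately simplified toy that only counts adjacent item pairs. The function names, the toy stream, `batch_size`, and `local_min_support` are all assumptions made for illustration and do not come from the paper.

```python
from collections import Counter


def count_pairs(items):
    """Count every ordered pair (x, y) where y directly follows x."""
    return Counter(zip(items, items[1:]))


def batch_free_counts(stream):
    """Toy 'batch-free' view: count pairs over the whole stream at once."""
    return count_pairs(stream)


def batch_based_counts(stream, batch_size, local_min_support):
    """Toy 'batch-based' view: count pairs inside each batch separately,
    keep only pairs that reach local_min_support within that batch,
    and sum the surviving counts across batches."""
    total = Counter()
    for start in range(0, len(stream), batch_size):
        batch = stream[start:start + batch_size]
        for pattern, count in count_pairs(batch).items():
            if count >= local_min_support:  # frequency decided locally
                total[pattern] += count
    return total


# Hypothetical toy stream: the sequence a, b, c repeated four times.
stream = list("abcabcabcabc")

whole = batch_free_counts(stream)
# Batches of size 3 are exactly "abc", so ("c", "a") always straddles
# a batch boundary and is never seen by the batch-based counter.
per_batch = batch_based_counts(stream, batch_size=3, local_min_support=1)
# With a local threshold of 2, every pair occurs only once per batch,
# so all pairs are pruned even though they recur across the stream.
pruned = batch_based_counts(stream, batch_size=3, local_min_support=2)

print("batch-free:      ", dict(whole))
print("batch-based (1): ", dict(per_batch))
print("batch-based (2): ", dict(pruned))
```

In this toy setting the pair `("c", "a")` is counted when the stream is scanned as a whole but is missing from the batch-based result, and raising the local threshold removes pairs that recur throughout the stream. A real sequential pattern miner works on subsequences of itemsets rather than adjacent pairs, so this only shows the shape of the problem.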

Keywords
Sequential pattern mining; Data streams; Batch-free
Original language: English
Pages (from-to): 223-239
Number of pages: 17
Journal: International Journal of Data Science and Analytics
Volume: 8
Issue number: 3
DOIs
Status: Published - 1 Oct 2019
Event: 6th International Workshop on Big Data, Streams and Heterogeneous Source Mining: Algorithms, Systems, Programming Models and Applications (KDD 2017) - Halifax, Canada
Duration: 14 Aug 2017 to 14 Aug 2017
