Sequential pattern mining (SPM) is a well-studied topic in data mining, in which one aims to discover common sequences of itemsets in a large corpus of temporal itemset data. Due to the sequential nature of data streams, supporting SPM in streaming environments is commonly studied in the area of data stream mining as well. Stream-based process discovery (PD), on the other hand, originates from the field of process mining and focusses on learning process models from online event data. In particular, the main goal of the discovered models is to describe the underlying generating process in an end-to-end fashion. As SPM and PD use data that are comparable in nature, that is, both involve time-stamped instances, one expects techniques from the SPM domain to be (partly) transferable to the PD domain. However, thus far, little work has been done at the intersection of the two fields. In this focus article, we therefore study the possible application of SPM techniques in the context of PD. We provide an overview of the two fields, covering their commonalities and differences, highlight the challenges of applying them, and present an outlook and several avenues for future work.

This article is categorized under: Algorithmic Development > Spatial and Temporal Data Mining; Fundamental Concepts of Data and Knowledge > Key Design Issues in Data Mining; Fundamental Concepts of Data and Knowledge > Big Data Mining.
|Number of pages|12|
|Journal|Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery|
|Publication status|Published - Nov 2019|
- data streams
- distributed sequential pattern mining
- process mining
- sequential pattern mining