Process mining techniques have matured over the last decade and more and more organization started to use this new technology. The two most important types of process mining are process discovery (i.e., learning a process model from example behavior recorded in an event log) and conformance checking (i.e., comparing modeled behavior with observed behavior). Process mining is motivated by the availability of event data. However, as event logs become larger (say terabytes), performance becomes a concern. The only way to handle larger applications while ensuring acceptable response times, is to distribute analysis over a network of computers (e.g., multicore systems, grids, and clouds). This paper provides an overview of the different ways in which process mining problems can be distributed. We identify three types of distribution: replication, a horizontal partitioning of the event log, and a vertical partitioning of the event log. These types are discussed in the context of both procedural (e.g., Petri nets) and declarative process models. Most challenging is the horizontal partitioning of event logs in the context of procedural models. Therefore, a new approach to decompose Petri nets and associated event logs is presented. This approach illustrates that process mining problems can be distributed in various ways.
|Title of host publication||Fundamental Approaches to Software Engineering (15th International Conference, FASE 2012, Held as Part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2012, Tallinn, Estonia, March 24 - April 1, 2012. Proceedings)|
|Editors||J. Lara, de, A. Zisman|
|Place of Publication||Berlin|
|Publication status||Published - 2012|
|Name||Lecture Notes in Computer Science|