Distributed process discovery and conformance checking

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; Academic; peer-review

44 Citations (Scopus)


Process mining techniques have matured over the last decade and more and more organizations have started to use this new technology. The two most important types of process mining are process discovery (i.e., learning a process model from example behavior recorded in an event log) and conformance checking (i.e., comparing modeled behavior with observed behavior). Process mining is motivated by the availability of event data. However, as event logs become larger (say, terabytes), performance becomes a concern. The only way to handle larger applications while ensuring acceptable response times is to distribute analysis over a network of computers (e.g., multicore systems, grids, and clouds). This paper provides an overview of the different ways in which process mining problems can be distributed. We identify three types of distribution: replication, a horizontal partitioning of the event log, and a vertical partitioning of the event log. These types are discussed in the context of both procedural (e.g., Petri nets) and declarative process models. Most challenging is the horizontal partitioning of event logs in the context of procedural models. Therefore, a new approach to decompose Petri nets and associated event logs is presented. This approach illustrates that process mining problems can be distributed in various ways.
Original language: English
Title of host publication: Fundamental Approaches to Software Engineering (15th International Conference, FASE 2012, Held as Part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2012, Tallinn, Estonia, March 24 - April 1, 2012. Proceedings)
Editors: J. de Lara, A. Zisman
Place of Publication: Berlin
ISBN (Print): 978-3-642-28871-5
Publication status: Published - 2012

Publication series

Name: Lecture Notes in Computer Science
ISSN (Print): 0302-9743

