Information-preserving abstractions of event data in process mining

Sander J.J. Leemans (Corresponding author), Dirk Fahland

Research output: Contribution to journal › Article › Academic › peer-review

14 Citations (Scopus)
1 Downloads (Pure)


Process mining aims at obtaining information about processes by analysing their past executions in event logs, event streams, or databases. Discovering a process model from a finite amount of event data thus requires correctly inferring infinitely many unseen behaviours. To this end, many process discovery techniques leverage abstractions on the finite event data to infer and preserve behavioural information of the underlying process. However, the fundamental information-preserving properties of these abstractions are not yet well understood. In this paper, we study the information-preserving properties of the “directly follows” abstraction and its limitations. We overcome these limitations by proposing and studying two new abstractions which preserve even more information in the form of finite graphs. We then show how, and characterize when, process behaviour can be unambiguously recovered through characteristic footprints in these abstractions. Our characterization defines large classes of practically relevant processes covering various complex process patterns. We prove that the information and the footprints preserved in the abstractions suffice to unambiguously rediscover the exact process model from a finite event log. Furthermore, we show that all three abstractions are relevant in practice to infer process models from event logs and outline the implications on process mining techniques.
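For readers unfamiliar with the “directly follows” abstraction mentioned in the abstract, the following is a minimal illustrative sketch and not the authors' implementation. It assumes an event log is given as a list of traces, each trace being a list of activity names; the function name `directly_follows` and the example log are hypothetical. The sketch records how often one activity is immediately followed by another, together with the start and end activities of each trace.

```python
from collections import Counter


def directly_follows(log):
    """Compute a directly-follows abstraction of an event log.

    `log` is assumed to be a list of traces; each trace is a list of
    activity names. Returns three Counters: edge counts for pairs
    (a, b) where b immediately follows a, start-activity counts, and
    end-activity counts.
    """
    edges, starts, ends = Counter(), Counter(), Counter()
    for trace in log:
        if not trace:
            continue  # an empty trace contributes no activities
        starts[trace[0]] += 1
        ends[trace[-1]] += 1
        # Pair every event with its immediate successor in the trace.
        for a, b in zip(trace, trace[1:]):
            edges[(a, b)] += 1
    return edges, starts, ends


# Hypothetical example log: b and c occur in both orders between a and d.
log = [
    ["a", "b", "c", "d"],
    ["a", "c", "b", "d"],
    ["a", "b", "c", "d"],
]
edges, starts, ends = directly_follows(log)
```

The result is a finite graph regardless of how many traces the log contains, which is why abstractions of this kind are a natural basis for inferring behaviour beyond what was observed.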

Original language: English
Pages (from-to): 1143–1197
Number of pages: 55
Journal: Knowledge and Information Systems
Issue number: 3
Publication status: Published - 1 Mar 2020


  • Directly follows
  • Inclusive choice
  • Information preservation
  • Language abstraction
  • Minimum self-distance
  • Model abstraction
  • Process mining
  • Rediscoverability


