Event log imperfection patterns for process mining: towards a systematic approach to cleaning event logs

S. Suriadi, Robert Andrews, A.H.M. ter Hofstede, M.T. Wynn

Research output: Contribution to journal, Article, Academic, peer-review

137 Citations (Scopus)

Abstract

Process-oriented data mining (process mining) uses algorithms and data (in the form of event logs) to construct models that aim to provide insights into organisational processes. The quality of the data (both form and content) presented to the modeling algorithms is critical to the success of the process mining exercise. Cleaning event logs to address quality issues prior to conducting a process mining analysis is a necessary, but generally tedious and ad hoc task. In this paper we describe a set of data quality issues, distilled from our experiences in conducting process mining analyses, commonly found in process mining event logs or encountered while preparing event logs from raw data sources. We show that patterns are used in a variety of domains as a means for describing commonly encountered problems and solutions. The main contributions of this article are in showing that a patterns-based approach is applicable to documenting commonly encountered event log quality issues, the formulation of a set of components for describing event log quality issues as patterns, and the description of a collection of 11 event log imperfection patterns distilled from our experiences in preparing event logs. We postulate that a systematic approach to using such a pattern repository to identify and repair event log quality issues benefits both the process of preparing an event log and the quality of the resulting event log. The relevance of the pattern-based approach is illustrated via application of the patterns in a case study and through an evaluation by researchers and practitioners in the field.
Original language: English
Pages (from-to): 132-150
Number of pages: 19
Journal: Information Systems
Volume: 64
Publication status: Published - 1 Mar 2017

Keywords

  • Data mining
  • Data quality
  • Event log preparation
  • Event log quality
  • Patterns
  • Process mining
  • Systematic data pre-processing
