Wanna improve process mining results? : it’s high time we consider data quality issues seriously

R.P. Jagadeesh Chandra Bose, R.S. Mans, W.M.P. van der Aalst

Research output: Book/Report, Report, Academic

Abstract

The growing interest in process mining is fueled by the growing availability of event data. Process mining techniques use event logs to automatically discover process models, check conformance, identify bottlenecks and deviations, suggest improvements, and predict processing times. The lion's share of process mining research has been devoted to analysis techniques. However, the proper handling of problems and challenges arising in analyzing event logs used as input is critical for the success of any process mining effort. In this paper, we identify four categories of process characteristics issues that may manifest in an event log (e.g., process problems related to event granularity and case heterogeneity) and 27 categories of event log quality issues (e.g., problems related to timestamps in event logs, imprecise activity names, and missing events). The systematic identification and analysis of these issues calls for a consolidated effort from the process mining community. Five real-life event logs are analyzed to illustrate the omnipresence of process and event log issues. We hope that these findings will encourage systematic logging approaches (to prevent event log issues), repair techniques (to alleviate event log issues), and analysis techniques (to deal with the manifestation of process characteristics in event logs).
Original language: English
Publisher: BPMcenter.org
Number of pages: 28
Publication status: Published - 2013

Publication series

Name: BPM reports
Volume: 1302
