Native directly follows operator

Research output: Contribution to journal › Article › Academic

Abstract

Typical legacy information systems store data in relational databases. Process mining is a research discipline that analyzes this data to obtain insights into processes. Many different process mining techniques can be applied to data. In current techniques, an XES event log serves as a basis for analysis. However, because of the static characteristic of an XES event log, we need to create one XES file for each process mining question, which leads to overhead and inflexibility. As an alternative, people attempt to perform process mining directly on the data source using so-called intermediate structures. In previous work, we investigated methods to build intermediate structures on source data by executing a basic SQL query on the database. However, the nested form in the SQL query can cause performance issues on the database side. Therefore, in this paper, we propose a native SQL operator for direct process discovery on relational databases. We define a native operator for the simplest form of the intermediate structure, called the "directly follows relation". This approach has been evaluated with big event data and the experimental results show that it performs faster than the state-of-the-art of database approaches.
Original language: English
Number of pages: 12
Journal: arXiv
Publication status: Published - 5 Jun 2018

Cite this

@article{45718d7e816b4cabb6a7011bfbd7f10d,
title = "Native directly follows operator",
abstract = "Typical legacy information systems store data in relational databases. Process mining is a research discipline that analyzes this data to obtain insights into processes. Many different process mining techniques can be applied to data. In current techniques, an XES event log serves as a basis for analysis. However, because of the static characteristic of an XES event log, we need to create one XES file for each process mining question, which leads to overhead and inflexibility. As an alternative, people attempt to perform process mining directly on the data source using so-called intermediate structures. In previous work, we investigated methods to build intermediate structures on source data by executing a basic SQL query on the database. However, the nested form in the SQL query can cause performance issues on the database side. Therefore, in this paper, we propose a native SQL operator for direct process discovery on relational databases. We define a native operator for the simplest form of the intermediate structure, called the {"}directly follows relation{"}. This approach has been evaluated with big event data and the experimental results show that it performs faster than the state-of-the-art of database approaches.",
keywords = "cs.DB",
author = "Alifah Syamsiyah and Dongen, {Boudewijn F. van} and Dijkman, {Remco M.}",
year = "2018",
month = "6",
day = "5",
language = "English",
journal = "arXiv",
publisher = "Cornell University Library",
}

Native directly follows operator. / Syamsiyah, Alifah; Dongen, Boudewijn F. van; Dijkman, Remco M.

In: arXiv, 05.06.2018.

TY - JOUR

T1 - Native directly follows operator

AU - Syamsiyah, Alifah

AU - Dongen, Boudewijn F. van

AU - Dijkman, Remco M.

PY - 2018/6/5

Y1 - 2018/6/5

N2 - Typical legacy information systems store data in relational databases. Process mining is a research discipline that analyzes this data to obtain insights into processes. Many different process mining techniques can be applied to data. In current techniques, an XES event log serves as a basis for analysis. However, because of the static characteristic of an XES event log, we need to create one XES file for each process mining question, which leads to overhead and inflexibility. As an alternative, people attempt to perform process mining directly on the data source using so-called intermediate structures. In previous work, we investigated methods to build intermediate structures on source data by executing a basic SQL query on the database. However, the nested form in the SQL query can cause performance issues on the database side. Therefore, in this paper, we propose a native SQL operator for direct process discovery on relational databases. We define a native operator for the simplest form of the intermediate structure, called the "directly follows relation". This approach has been evaluated with big event data and the experimental results show that it performs faster than the state-of-the-art of database approaches.

KW - cs.DB

M3 - Article

JO - arXiv

JF - arXiv

ER -