TY - BOOK
T1 - Process mining : A two-step approach to balance between underfitting and overfitting
AU - Aalst, van der, W.M.P.
AU - Rubin, V.A.
AU - Verbeek, H.M.W.
AU - Dongen, van, B.F.
AU - Kindler, E.
AU - Günther, C.W.
PY - 2008
Y1 - 2008
N2 - Process mining includes the automated discovery of processes from event logs. Based on observed events (e.g., activities being executed or messages being exchanged) a process model is constructed. One of the essential problems in process mining is that one cannot assume to have seen all possible behavior. At best, one has seen a representative subset. Therefore, classical synthesis techniques are not suitable as they aim at finding a model that is able to exactly reproduce the log. Existing process mining techniques try to avoid such "overfitting" by generalizing the model to allow for more behavior. This generalization is often driven by the representation language and very crude assumptions about completeness. As a result, parts of the model are "overfitting" (allow only what has actually been observed) while other parts may be "underfitting" (allow for much more behavior without strong support for it). None of the existing techniques enables the user to control the balance between "overfitting" and "underfitting". To address this, we propose a two-step approach. First, using a configurable approach, a transition system is constructed. Then, using the "theory of regions", the model is synthesized. The approach has been implemented in the context of ProM and overcomes many of the limitations of traditional approaches.
AB - Process mining includes the automated discovery of processes from event logs. Based on observed events (e.g., activities being executed or messages being exchanged) a process model is constructed. One of the essential problems in process mining is that one cannot assume to have seen all possible behavior. At best, one has seen a representative subset. Therefore, classical synthesis techniques are not suitable as they aim at finding a model that is able to exactly reproduce the log. Existing process mining techniques try to avoid such "overfitting" by generalizing the model to allow for more behavior. This generalization is often driven by the representation language and very crude assumptions about completeness. As a result, parts of the model are "overfitting" (allow only what has actually been observed) while other parts may be "underfitting" (allow for much more behavior without strong support for it). None of the existing techniques enables the user to control the balance between "overfitting" and "underfitting". To address this, we propose a two-step approach. First, using a configurable approach, a transition system is constructed. Then, using the "theory of regions", the model is synthesized. The approach has been implemented in the context of ProM and overcomes many of the limitations of traditional approaches.
M3 - Report
T3 - BPM reports
BT - Process mining : A two-step approach to balance between underfitting and overfitting
PB - BPMcenter.org
CY - Eindhoven/Brisbane
ER -