Process mining, a new business intelligence area, aims at discovering process models from event logs. Complex constructs, noise and infrequent behavior make process mining a hard problem. A genetic mining algorithm, which applies genetic operators to search the space of all possible process models, handles these challenges successfully. Its drawback is a high computation time, caused by the cost of fitness evaluation, which depends linearly on the number of process instances in the log. By using a sampling-based approach, i.e. evaluating fitness on a sample of the log instead of the whole log, we drastically reduce the computation time. Once the desired fitness is achieved on the sample, we check the fitness on the whole log; if it has not been achieved there, we increase the sample size and continue the computation iteratively. Our experiments show that sampling works well even for relatively small logs, and that the total computation time is reduced by a factor of 6 to 15.
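
The iterative sampling loop described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the genetic search is replaced by a trivial greedy `improve` step, a model is reduced to a set of directly-follows pairs, and the names `fitness`, `improve`, `mine_with_sampling`, `start_size` and `growth` are all hypothetical. The log is assumed to be non-empty and `target` to lie in (0, 1].

```python
import random

def fitness(model, traces):
    """Fraction of traces whose directly-follows pairs are all in the model."""
    fitting = sum(all(p in model for p in zip(t, t[1:])) for t in traces)
    return fitting / len(traces)

def improve(model, traces):
    """Stand-in for the genetic search (assumption): cover one non-fitting trace."""
    for t in traces:
        pairs = set(zip(t, t[1:]))
        if not pairs <= model:
            return model | pairs
    return model

def mine_with_sampling(log, target=1.0, start_size=5, growth=2, seed=0):
    """Search on a sample, verify on the whole log, and grow the sample if needed."""
    rng = random.Random(seed)
    size = min(start_size, len(log))
    model = frozenset()
    while True:
        sample = rng.sample(log, size)
        # Cheap inner loop: fitness is evaluated on the sample only.
        while fitness(model, sample) < target:
            model = improve(model, sample)
        # Expensive check on the whole log, done once per sample size.
        if fitness(model, log) >= target or size == len(log):
            return model, size
        size = min(len(log), size * growth)

log = [("a", "b", "c")] * 50 + [("a", "c", "b")] * 30 + [("a", "d")] * 2
model, final_size = mine_with_sampling(log)
print(sorted(model), final_size)
```

The point of the design is that the costly whole-log evaluation runs only once per sample size, while the many evaluations inside the search touch only the sample.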
|Name||Lecture Notes in Computer Science|
|Conference||14th International Conference on Knowledge-Based and Intelligent Information & Engineering Systems (KES'2010), 2010-09-08 to 2010-09-10|
|Period||8 September 2010 → 10 September 2010|