Mining exceptional relationships with grammar-guided genetic programming

J.M. Luna, M. Pechenizkiy, S. Ventura

Research output: Contribution to journal › Article › Academic › peer-review

9 Citations (Scopus)
5 Downloads (Pure)

Abstract

Given a database of records, it might be possible to identify small subsets of data whose distribution is exceptionally different from the distribution in the complete set of data records. Finding such interesting relationships, which we call exceptional relationships, in an automated way would allow discovering unusual or exceptional hidden behaviour. In this paper, we formulate the problem of mining exceptional relationships as a special case of exceptional model mining and propose a grammar-guided genetic programming algorithm (MERG3P) that enables the discovery of any exceptional relationships. In particular, MERG3P can work directly not only with categorical but also with numerical data. In the experimental evaluation, we conduct a case study on mining exceptional relations between well-known and widely used quality measures of association rules, whose exceptional behaviour would be of interest to pattern mining experts. For this purpose, we constructed a data set comprising a wide range of values for each considered association rule quality measure, such that possible exceptional relations between measures could be discovered. Thus, besides the actual validation of MERG3P, we found that the Support and Leverage measures are in fact negatively correlated under certain conditions, while in general experts in the field expect these measures to be positively correlated.

Keywords: Association rules; Exceptional subgroups; Genetic programming
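The core idea in the abstract, a subgroup whose correlation between two attributes departs from the correlation in the complete data, can be illustrated with a minimal sketch. This is not MERG3P and involves no genetic programming; it only scores one hand-picked subgroup. The function names (`pearson`, `correlation_gap`), the record fields (`g`, `x`, `y`) and the toy values are all assumptions made up for illustration, not data from the paper.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def correlation_gap(records, condition, x_key, y_key):
    """Compare the x/y correlation in the whole data set with the
    correlation inside the subgroup selected by `condition`.

    Returns (whole_r, sub_r, gap), where gap = |whole_r - sub_r|.
    """
    subgroup = [r for r in records if condition(r)]
    whole_r = pearson([r[x_key] for r in records],
                      [r[y_key] for r in records])
    sub_r = pearson([r[x_key] for r in subgroup],
                    [r[y_key] for r in subgroup])
    return whole_r, sub_r, abs(whole_r - sub_r)


# Hypothetical toy records: the subgroup g == 1 runs against the overall trend.
records = [
    {"g": 1, "x": 1, "y": 3},
    {"g": 1, "x": 2, "y": 2},
    {"g": 1, "x": 3, "y": 1},
    {"g": 0, "x": 1, "y": 1},
    {"g": 0, "x": 2, "y": 2},
    {"g": 0, "x": 3, "y": 3},
    {"g": 0, "x": 4, "y": 4},
    {"g": 0, "x": 5, "y": 5},
]

whole_r, sub_r, gap = correlation_gap(records, lambda r: r["g"] == 1, "x", "y")
print(f"whole: {whole_r:.3f}  subgroup: {sub_r:.3f}  gap: {gap:.3f}")
```

In this toy data the two attributes are positively correlated overall but negatively correlated inside the subgroup, which is the kind of reversal the abstract reports for Support and Leverage. A search procedure such as the one the paper proposes would explore many candidate subgroup descriptions rather than a single fixed condition.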
Original language: English
Pages (from-to): 571-594
Number of pages: 24
Journal: Knowledge and Information Systems
Volume: 47
Issue number: 3
Status: Published - Jun 2016
