An experimental comparison of different inclusion relations in frequent tree mining

J. de Knijf, A.J. Feelders

    Research output: Contribution to journal, Article, Academic, peer-reviewed

    Abstract

    In recent years, a variety of mining algorithms have been developed to derive all frequent subtrees from a database of labeled, ordered, rooted trees. These algorithms share properties such as enumeration strategies and pruning techniques. They differ, however, in the tree inclusion relation used and in the way attribute values are handled. In this work we investigate the different approaches with respect to the 'usefulness' of the derived patterns, in particular the performance of classifiers that use the derived patterns as features. To find a good trade-off between expressiveness and runtime performance, we also take into account the complexity of the different classifiers, as well as the runtime and memory usage of the different approaches. The experiments are performed on two real data sets and two synthetic data sets. The results show that a significant improvement in both predictive performance and computational efficiency can be gained by choosing the right tree mining approach.
    Original language: English
    Pages (from-to): 1-22
    Journal: Fundamenta Informaticae
    Volume: 89
    Issue number: 1
    Status: Published - 2008
