In this work we describe a new algorithm for mining tree-structured data. Our method computes an almost smallest supertree by iteratively applying tree alignment. This supertree is a global pattern that can be used for both descriptive and predictive data mining tasks. Experiments performed on two real datasets show that our approach leads to a drastic compression of the database. Furthermore, when the resulting pattern is used for classification, the results show a considerable improvement over existing algorithms. Moreover, the incremental nature of the algorithm provides a flexible way of dealing with extension or reduction of the original dataset. Finally, the computation of the almost smallest supertree can be easily parallelized.
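To make the idea of building a supertree incrementally more concrete, here is a minimal Python sketch. It is not the algorithm from the paper: the abstract does not specify the alignment procedure, so the sketch substitutes a simple greedy merge that matches children by label. The tree encoding (`(label, children)` tuples), the function names `merge`, `size`, and `supertree`, and the assumption that trees are unordered and share a root label are all illustrative choices.

```python
from functools import reduce


def merge(a, b):
    """Greedily merge two labelled trees that share a root label.

    A tree is a (label, children) tuple, where children is a tuple of trees.
    This is a stand-in for tree alignment (an assumption of this sketch):
    children are paired by label, and unmatched children are simply kept.
    The result contains both inputs, but is not guaranteed to be smallest.
    """
    label_a, kids_a = a
    label_b, kids_b = b
    if label_a != label_b:
        raise ValueError("roots must share a label")
    remaining = list(kids_b)  # children of b not yet paired with a child of a
    merged = []
    for kid_a in kids_a:
        # Find the first unpaired child of b with the same label, if any.
        idx = next(
            (i for i, kid_b in enumerate(remaining) if kid_b[0] == kid_a[0]),
            None,
        )
        if idx is None:
            merged.append(kid_a)
        else:
            merged.append(merge(kid_a, remaining.pop(idx)))
    merged.extend(remaining)  # children that occur only in b
    return (label_a, tuple(merged))


def size(tree):
    """Number of nodes in a tree."""
    return 1 + sum(size(kid) for kid in tree[1])


def supertree(trees):
    """Fold a collection of trees into one supertree, one tree at a time."""
    return reduce(merge, trees)


t1 = ("a", (("b", ()), ("c", (("d", ()),))))
t2 = ("a", (("c", (("e", ()),)), ("f", ())))
combined = supertree([t1, t2])
print(size(t1) + size(t2), size(combined))
```

Because `supertree` is a left fold, adding a tree to the dataset only requires one further `merge` against the current result, which loosely mirrors the incremental behaviour the abstract describes. Shared nodes are stored once, so the merged tree is smaller than the input trees taken together.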
|Title of host publication||Proceedings of the SIAM International Conference on Data Mining (SDM 2008, Atlanta GA, USA, April 24-26, 2008)|
|Publisher||Society for Industrial and Applied Mathematics (SIAM)|
|Publication status||Published - 2008|