Abstract
In this work we describe a new algorithm for mining tree-structured data. Our method computes an almost smallest supertree by iteratively applying tree alignment. This supertree is a global pattern that can be used for both descriptive and predictive data mining tasks. Experiments on two real datasets show that our approach leads to a drastic compression of the database. Furthermore, when the resulting pattern is used for classification, the results show a considerable improvement over existing algorithms. Moreover, the incremental nature of the algorithm provides a flexible way of dealing with extension or reduction of the original dataset. Finally, the computation of the almost smallest supertree can be easily parallelized.
Original language | English |
---|---|
Title of host publication | Proceedings of the SIAM International Conference on Data Mining (SDM 2008, Atlanta GA, USA, April 24-26, 2008) |
Publisher | Society for Industrial and Applied Mathematics (SIAM) |
Pages | 61-71 |
Publication status | Published - 2008 |