We start from an algorithm for on-line linear hierarchical classification of multidimensional data that uses a centroid aggregation criterion. After describing some real-life on-line settings where it can be used, we analyze it mathematically in the framework of the Lance–Williams algorithms and prove that it lacks some useful properties: it is neither monotonic nor space-conserving. To make use of its on-line capabilities, we modify it and show that the modified algorithm is monotonic. Although it still lacks the internal similarity-external dissimilarity property, its worst-case classifications can be corrected with a small additional computational effort, for an overall running time of O(nk) for n points and k classes. An experimental study confirms the theoretical improvements over the initial algorithm. A theoretical and experimental comparison with other algorithms from the literature shows that it is among the fastest and performs well.
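The monotonicity issue mentioned in the abstract can be illustrated with a minimal sketch. It assumes the textbook batch agglomerative procedure with a centroid criterion, not the on-line linear algorithm studied in the paper, and the function names and the three sample points are hypothetical, chosen only for illustration. A hierarchy is monotonic when successive merge heights never decrease; with centroids, a later merge can occur at a smaller distance than an earlier one.

```python
import math


def centroid(points):
    """Arithmetic mean of a non-empty list of equal-length coordinate tuples."""
    n = len(points)
    dim = len(points[0])
    return tuple(sum(p[i] for p in points) / n for i in range(dim))


def centroid_merge_heights(points):
    """Naive batch agglomeration with a centroid criterion.

    Illustrative only: this is NOT the paper's on-line algorithm.
    Returns the centroid distance at which each successive merge happens.
    """
    clusters = [[p] for p in points]
    heights = []
    while len(clusters) > 1:
        best = None
        # Find the pair of clusters whose centroids are closest.
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = math.dist(centroid(clusters[i]), centroid(clusters[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        heights.append(d)
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return heights


# Hypothetical sample: a nearly equilateral triangle.
heights = centroid_merge_heights([(0.0, 0.0), (2.0, 0.0), (1.0, 1.8)])
is_monotonic = all(a <= b for a, b in zip(heights, heights[1:]))
print(heights, is_monotonic)
```

In this sample the two base points merge first at distance 2.0, and their centroid (1.0, 0.0) then lies only 1.8 away from the third point, so the second merge is lower than the first. This kind of inversion is what a monotonic criterion rules out.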
|Journal||Information Technology and Management|
|Publication status||Published - 2004|