A monotonic on-line linear algorithm for hierarchical agglomerative classification

A.B. Dragut, C.M. Nichitiu

Research output: Contribution to journalArticleAcademicpeer-review

1 Downloads (Pure)

Abstract

We start from an algorithm for on-line linear hierarchical classification for multidimensional data, using a centroid aggregation criterion. After evoking some real-life on-line settings where it can be used, we analyze it mathematically, in the framework of the Lance–Williams algorithms, proving that it does not have some useful properties: it is not monotonic, nor space-conserving. In order to use its on-line capabilities, we modify it and show that it becomes monotonic. While still not having the internal similarity-external dissimilarity property, the worst case classifications of the new algorithm are correctable with an additional small computational effort, on the overall taking O(nk) time for n points and k classes. Experimental study confirm the theoretical improvements upon the initial algorithm. A theoretical and experimental comparison to other algorithms from the literature, shows that it is among the fastest and performs well.
Original languageEnglish
Pages (from-to)111-141
JournalInformation Technology and Management
Volume5
Issue number1-2
DOIs
Publication statusPublished - 2004

Fingerprint

aggregation
Agglomeration
literature
time
Experimental study
Dissimilarity

Cite this

@article{e3cc4409c2984a1086ba4ce125f7ceb3,
title = "A monotonic on-line linear algorithm for hierarchical agglomerative classification",
abstract = "We start from an algorithm for on-line linear hierarchical classification for multidimensional data, using a centroid aggregation criterion. After evoking some real-life on-line settings where it can be used, we analyze it mathematically, in the framework of the Lance–Williams algorithms, proving that it does not have some useful properties: it is not monotonic, nor space-conserving. In order to use its on-line capabilities, we modify it and show that it becomes monotonic. While still not having the internal similarity-external dissimilarity property, the worst case classifications of the new algorithm are correctable with an additional small computational effort, on the overall taking O(nk) time for n points and k classes. Experimental study confirm the theoretical improvements upon the initial algorithm. A theoretical and experimental comparison to other algorithms from the literature, shows that it is among the fastest and performs well.",
author = "A.B. Dragut and C.M. Nichitiu",
year = "2004",
doi = "10.1023/B:ITEM.0000008078.09272.89",
language = "English",
volume = "5",
pages = "111--141",
journal = "Information Technology and Management",
issn = "1385-951X",
publisher = "Kluwer Academic Publishers",
number = "1-2",

}

A monotonic on-line linear algorithm for hierarchical agglomerative classification. / Dragut, A.B.; Nichitiu, C.M.

In: Information Technology and Management, Vol. 5, No. 1-2, 2004, p. 111-141.

Research output: Contribution to journalArticleAcademicpeer-review

TY - JOUR

T1 - A monotonic on-line linear algorithm for hierarchical agglomerative classification

AU - Dragut, A.B.

AU - Nichitiu, C.M.

PY - 2004

Y1 - 2004

N2 - We start from an algorithm for on-line linear hierarchical classification for multidimensional data, using a centroid aggregation criterion. After evoking some real-life on-line settings where it can be used, we analyze it mathematically, in the framework of the Lance–Williams algorithms, proving that it does not have some useful properties: it is not monotonic, nor space-conserving. In order to use its on-line capabilities, we modify it and show that it becomes monotonic. While still not having the internal similarity-external dissimilarity property, the worst case classifications of the new algorithm are correctable with an additional small computational effort, on the overall taking O(nk) time for n points and k classes. Experimental study confirm the theoretical improvements upon the initial algorithm. A theoretical and experimental comparison to other algorithms from the literature, shows that it is among the fastest and performs well.

AB - We start from an algorithm for on-line linear hierarchical classification for multidimensional data, using a centroid aggregation criterion. After evoking some real-life on-line settings where it can be used, we analyze it mathematically, in the framework of the Lance–Williams algorithms, proving that it does not have some useful properties: it is not monotonic, nor space-conserving. In order to use its on-line capabilities, we modify it and show that it becomes monotonic. While still not having the internal similarity-external dissimilarity property, the worst case classifications of the new algorithm are correctable with an additional small computational effort, on the overall taking O(nk) time for n points and k classes. Experimental study confirm the theoretical improvements upon the initial algorithm. A theoretical and experimental comparison to other algorithms from the literature, shows that it is among the fastest and performs well.

U2 - 10.1023/B:ITEM.0000008078.09272.89

DO - 10.1023/B:ITEM.0000008078.09272.89

M3 - Article

VL - 5

SP - 111

EP - 141

JO - Information Technology and Management

JF - Information Technology and Management

SN - 1385-951X

IS - 1-2

ER -