Using internal evaluation measures to validate the quality of diverse stream clustering algorithms

M. Hassani, T. Seidl

    Research output: Contribution to journal › Article › Academic › peer-review


    Abstract

    Measuring the quality of a clustering algorithm has proven to be as important as the algorithm itself. It is a crucial part of choosing the clustering algorithm that performs best for given input data. Streaming input data have many features that make them much more challenging than static data: they are unbounded, evolving, and arrive at high speed. This has raised new challenges for clustering algorithms as well as for their evaluation measures. Until now, external evaluation measures have been used exclusively for validating stream clustering algorithms. External validation, however, requires a ground truth that is not available in most applications, particularly in the streaming case, whereas internal clustering validation is efficient and realistic. In this article, we analyze the properties and performance of eleven internal clustering measures. In particular, we apply these measures to carefully synthesized stream scenarios to reveal how they react to clusterings of evolving data streams produced by both k-means-based and density-based clustering algorithms. A series of experimental results shows that, in contrast to the case of static data, the Calinski-Harabasz index performs best in coping with common aspects and errors of stream clustering for k-means-based algorithms, while the revised validity index performs best for density-based ones.
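
As a minimal illustration of what an internal measure computes, the sketch below evaluates the Calinski-Harabasz index (the ratio of between-cluster to within-cluster dispersion, each normalized by its degrees of freedom) on one small batch of points. It is not the authors' experimental setup: the function name `calinski_harabasz`, the toy `window` of points, and the two label assignments are assumptions made for this example, and a real stream evaluation would recompute the score per window or per micro-cluster snapshot.

```python
from collections import defaultdict


def calinski_harabasz(points, labels):
    """Calinski-Harabasz index for one batch of labeled points.

    CH = (B / (k - 1)) / (W / (n - k)), where B is the between-cluster
    dispersion and W the within-cluster dispersion. No ground truth is used.
    """
    n = len(points)
    dim = len(points[0])

    # Group points by their assigned cluster label.
    clusters = defaultdict(list)
    for point, label in zip(points, labels):
        clusters[label].append(point)
    k = len(clusters)
    if k < 2 or k >= n:
        raise ValueError("the index is defined only for 2 <= k < n")

    def mean(ps):
        return [sum(p[d] for p in ps) / len(ps) for d in range(dim)]

    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    overall = mean(points)
    between = 0.0
    within = 0.0
    for members in clusters.values():
        centroid = mean(members)
        between += len(members) * sqdist(centroid, overall)
        within += sum(sqdist(p, centroid) for p in members)

    if within == 0.0:
        return float("inf")  # every point coincides with its centroid
    return (between / (k - 1)) / (within / (n - k))


# Hypothetical window taken from a stream: two well-separated groups.
window = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
good_labels = [0, 0, 0, 1, 1, 1]  # matches the visible structure
bad_labels = [0, 1, 0, 1, 0, 1]   # mixes the two groups

good_score = calinski_harabasz(window, good_labels)
bad_score = calinski_harabasz(window, bad_labels)
```

A higher score indicates more compact and better separated clusters, so `good_score` comes out far larger than `bad_score` on this toy window.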
    Original language: English
    Pages (from-to): 171–183
    Number of pages: 13
    Journal: Vietnam Journal of Computer Science
    Volume: 4
    Issue number: 3
    Status: Published - 1 Aug 2017
