Internal clustering evaluation of data streams

M. Hassani, T. Seidl

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; Academic; peer-review

2 Citations (Scopus)

Abstract

Clustering validation is a crucial part of choosing the clustering algorithm that performs best on given input data. Internal clustering validation is efficient and realistic, whereas external validation requires a ground truth, which is not available in most applications. In this paper, we analyze the properties and performance of eleven internal clustering measures. In particular, as the importance of streaming data grows, we apply these measures to carefully synthesized stream scenarios to reveal how they react to clusterings on evolving data streams. A series of experiments shows that, unlike in the static-data case, the Calinski-Harabasz index performs best in coping with common aspects and errors of stream clustering.
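
For readers unfamiliar with the measure named in the abstract, the following is a minimal sketch of the standard Calinski-Harabasz index (ratio of between-cluster to within-cluster dispersion, each normalized by its degrees of freedom). It is not taken from the paper: the function name `calinski_harabasz` and the toy data are hypothetical, and in a stream setting one would presumably evaluate it on a window or snapshot of recent points, which this sketch does not model.

```python
from collections import defaultdict


def calinski_harabasz(points, labels):
    """Standard Calinski-Harabasz index for a hard clustering.

    points: sequence of equal-length numeric tuples
    labels: sequence of cluster ids, one per point
    Higher values indicate compact, well-separated clusters.
    """
    n = len(points)
    clusters = defaultdict(list)
    for point, label in zip(points, labels):
        clusters[label].append(point)
    k = len(clusters)
    if k < 2 or n <= k:
        raise ValueError("need at least 2 clusters and more points than clusters")

    def mean(rows):
        # Component-wise mean of a list of points.
        return [sum(col) / len(rows) for col in zip(*rows)]

    def sqdist(a, b):
        # Squared Euclidean distance.
        return sum((x - y) ** 2 for x, y in zip(a, b))

    overall = mean(points)
    between = 0.0  # size-weighted dispersion of centroids around the global mean
    within = 0.0   # dispersion of points around their own centroid
    for members in clusters.values():
        centroid = mean(members)
        between += len(members) * sqdist(centroid, overall)
        within += sum(sqdist(p, centroid) for p in members)

    if within == 0.0:
        return float("inf")  # every cluster is a single repeated point
    return (between / (k - 1)) / (within / (n - k))


# Toy data (an assumption for illustration): two tight, well-separated groups.
points = [(0, 0), (0, 1), (10, 10), (10, 11)]
good = calinski_harabasz(points, [0, 0, 1, 1])  # labels match the groups
bad = calinski_harabasz(points, [0, 1, 0, 1])   # labels cut across the groups
```

On this toy data the index is far higher for the labeling that matches the two groups than for the one that mixes them, which is the behavior an internal measure is expected to show without access to a ground truth.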
Original language: English
Title of host publication: Trends and Applications in Knowledge Discovery and Data Mining - PAKDD 2015 Workshops: BigPMA, VLSP, QIMIE, DAEBH, Ho Chi Minh City, Vietnam, May 19-21, 2015. Revised Selected Papers
Editors: Xiao-Li Li, Tru Cao, Ee-Peng Lim, Zhi-Hua Zhou, Tu-Bao Ho, David Cheung, Hiroshi Motoda
Pages: 198-209
Number of pages: 12
ISBN (Electronic): 978-3-319-25660-3
DOI: https://doi.org/10.1007/978-3-319-25660-3_17
Publication status: Published - 2015
Externally published: Yes
Event: QIMIE Workshop (Quality Issues, Measures of Interestingness and Evaluation of Data Mining Models) - Ho Chi Minh City, Viet Nam
Duration: 19 May 2015 to 19 May 2015

Publication series

Name: Lecture Notes in Artificial Intelligence
Publisher: Springer
Volume: 9441
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: QIMIE Workshop (Quality Issues, Measures of Interestingness and Evaluation of Data Mining Models)
Country: Viet Nam
City: Ho Chi Minh City
Period: 19/05/15 to 19/05/15

Cite this

Hassani, M., & Seidl, T. (2015). Internal clustering evaluation of data streams. In X-L. Li, T. Cao, E-P. Lim, Z-H. Zhou, T-B. Ho, D. Cheung, & H. Motoda (Eds.), Trends and Applications in Knowledge Discovery and Data Mining - PAKDD 2015 Workshops: BigPMA, VLSP, QIMIE, DAEBH, Ho Chi Minh City, Vietnam, May 19-21, 2015. Revised Selected Papers (pp. 198-209). (Lecture Notes in Artificial Intelligence; Vol. 9441). https://doi.org/10.1007/978-3-319-25660-3_17