Internal clustering evaluation of data streams

M. Hassani, T. Seidl

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Academic › peer-review

2 Citations (Scopus)


Clustering validation is a crucial part of choosing the clustering algorithm that performs best on given input data. Internal clustering validation is efficient and realistic, whereas external validation requires a ground truth, which is not available in most applications. In this paper, we analyze the properties and performance of eleven internal clustering measures. In particular, as the importance of streaming data grows, we apply these measures to carefully synthesized stream scenarios to reveal how they react to clusterings of evolving data streams. A series of experiments shows that, unlike in the static-data case, the Calinski-Harabasz index copes best with common aspects and errors of stream clustering.
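For readers unfamiliar with the Calinski-Harabasz index named in the abstract, the following is a minimal sketch of its standard textbook definition: the ratio of between-cluster dispersion to within-cluster dispersion, each normalized by its degrees of freedom. It is not the authors' streaming evaluation setup. The function name `calinski_harabasz`, the toy points, and both labelings are assumptions made for illustration only.

```python
from collections import defaultdict


def calinski_harabasz(points, labels):
    """Standard Calinski-Harabasz index; higher values indicate
    compact, well-separated clusters. Needs no ground truth."""
    n = len(points)
    dim = len(points[0])

    # Group the points by their assigned cluster label.
    clusters = defaultdict(list)
    for point, label in zip(points, labels):
        clusters[label].append(point)
    k = len(clusters)
    if k < 2 or k >= n:
        raise ValueError("need 2 <= number of clusters < number of points")

    def mean(ps):
        return [sum(p[d] for p in ps) / len(ps) for d in range(dim)]

    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    overall = mean(points)
    between = 0.0  # weighted squared distance of centroids to the global mean
    within = 0.0   # squared distance of points to their own centroid
    for members in clusters.values():
        centroid = mean(members)
        between += len(members) * sqdist(centroid, overall)
        within += sum(sqdist(p, centroid) for p in members)
    if within == 0.0:
        raise ValueError("within-cluster dispersion is zero; index undefined")

    return (between / (k - 1)) / (within / (n - k))


# Hypothetical toy data: two tight, well-separated groups of four points.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11)]

good = calinski_harabasz(points, [0, 0, 0, 0, 1, 1, 1, 1])  # matches the groups
bad = calinski_harabasz(points, [0, 1, 0, 1, 0, 1, 0, 1])   # mixes the groups
print(good, bad)
```

On this toy data the labeling that matches the two groups scores far higher than the labeling that mixes them. Separating a good clustering from a faulty one without a ground truth is what internal measures are for.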
Original language: English
Title of host publication: Trends and Applications in Knowledge Discovery and Data Mining - PAKDD 2015 Workshops: BigPMA, VLSP, QIMIE, DAEBH, Ho Chi Minh City, Vietnam, May 19-21, 2015. Revised Selected Papers
Editors: Xiao-Li Li, Tru Cao, Ee-Peng Lim, Zhi-Hua Zhou, Tu-Bao Ho, David Cheung, Hiroshi Motoda
Number of pages: 12
ISBN (Electronic): 978-3-319-25660-3
Publication status: Published - 2015
Externally published: Yes
Event: QIMIE Workshop (Quality Issues, Measures of Interestingness and Evaluation of Data Mining Models) - Ho Chi Minh City, Viet Nam
Duration: 19 May 2015 to 19 May 2015

Publication series

Name: Lecture Notes in Artificial Intelligence
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349


Conference: QIMIE Workshop (Quality Issues, Measures of Interestingness and Evaluation of Data Mining Models)
Country/Territory: Viet Nam
City: Ho Chi Minh City

