A bounded-size clustering algorithm on fully-dynamic streaming graphs

Jianpeng Zhang, Yulong Pei, George Fletcher, Mykola Pechenizkiy

Research output: Contribution to journalArticleAcademicpeer-review

1 Citation (Scopus)

Abstract

Many contemporary data sources in a variety of domains can naturally be represented as fully-dynamic streaming graphs. How to design an efficient online streaming clustering algorithm on such graphs is of great concern. However, existing clustering approaches are inappropriate for this specific task because: (1) static clustering approaches require expensive computational cost to cluster the graph for each update and (2) the existing streaming clustering neither could fully support insertion/deletion of edges nor take temporal information into account. To tackle these issues, in this work, firstly we propose an appropriate streaming clustering model and design two new core components: streaming reservoir and cluster manager. Then we present an evolution-aware bounded-size clustering algorithm to handle the edge additions/deletions. It requires the clusters to satisfy the maximum cluster-size constraint, and maintains the recency of edges in the temporal sequence and gives high priority to the recent edges in each cluster. The experimental results show that the proposed BSC algorithm outperforms current online algorithms and is capable to keep track of the evolution of graphs. Furthermore, it obtains almost one order of magnitude higher throughput than the state-of-the-art algorithms.

Original languageEnglish
Pages (from-to)1039-1058
Number of pages20
JournalIntelligent Data Analysis
Volume22
Issue number5
DOIs
Publication statusPublished - 1 Sep 2018

Keywords

  • Streaming graph
  • clustering algorithm
  • streaming reservoir
  • cluster manager

Fingerprint Dive into the research topics of 'A bounded-size clustering algorithm on fully-dynamic streaming graphs'. Together they form a unique fingerprint.

  • Cite this