A bounded-size clustering algorithm on fully-dynamic streaming graphs

Onderzoeksoutput: Bijdrage aan tijdschriftTijdschriftartikelAcademicpeer review

1 Citaat (Scopus)

Uittreksel

Many contemporary data sources in a variety of domains can naturally be represented as fully-dynamic streaming graphs. How to design an efficient online streaming clustering algorithm on such graphs is of great concern. However, existing clustering approaches are inappropriate for this specific task because: (1) static clustering approaches require expensive computational cost to cluster the graph for each update and (2) the existing streaming clustering neither could fully support insertion/deletion of edges nor take temporal information into account. To tackle these issues, in this work, firstly we propose an appropriate streaming clustering model and design two new core components: streaming reservoir and cluster manager. Then we present an evolution-aware bounded-size clustering algorithm to handle the edge additions/deletions. It requires the clusters to satisfy the maximum cluster-size constraint, and maintains the recency of edges in the temporal sequence and gives high priority to the recent edges in each cluster. The experimental results show that the proposed BSC algorithm outperforms current online algorithms and is capable to keep track of the evolution of graphs. Furthermore, it obtains almost one order of magnitude higher throughput than the state-of-the-art algorithms.

TaalEngels
Pagina's1039-1058
Aantal pagina's20
TijdschriftIntelligent Data Analysis
Volume22
Nummer van het tijdschrift5
DOI's
StatusGepubliceerd - 1 sep 2018

Vingerafdruk

Streaming
Clustering algorithms
Clustering Algorithm
Clustering
Graph in graph theory
Deletion
Managers
Throughput
Online Algorithms
High Throughput
Insertion
Computational Cost
Update
Costs
Experimental Results
Design

Trefwoorden

    Citeer dit

    @article{55b2c816a8094199ab9cff9532f886be,
    title = "A bounded-size clustering algorithm on fully-dynamic streaming graphs",
    abstract = "Many contemporary data sources in a variety of domains can naturally be represented as fully-dynamic streaming graphs. How to design an efficient online streaming clustering algorithm on such graphs is of great concern. However, existing clustering approaches are inappropriate for this specific task because: (1) static clustering approaches require expensive computational cost to cluster the graph for each update and (2) the existing streaming clustering neither could fully support insertion/deletion of edges nor take temporal information into account. To tackle these issues, in this work, firstly we propose an appropriate streaming clustering model and design two new core components: streaming reservoir and cluster manager. Then we present an evolution-aware bounded-size clustering algorithm to handle the edge additions/deletions. It requires the clusters to satisfy the maximum cluster-size constraint, and maintains the recency of edges in the temporal sequence and gives high priority to the recent edges in each cluster. The experimental results show that the proposed BSC algorithm outperforms current online algorithms and is capable to keep track of the evolution of graphs. Furthermore, it obtains almost one order of magnitude higher throughput than the state-of-the-art algorithms.",
    keywords = "Streaming graph, clustering algorithm, streaming reservoir, cluster manager",
    author = "Jianpeng Zhang and Yulong Pei and George Fletcher and Mykola Pechenizkiy",
    year = "2018",
    month = "9",
    day = "1",
    doi = "10.3233/IDA-173573",
    language = "English",
    volume = "22",
    pages = "1039--1058",
    journal = "Intelligent Data Analysis",
    issn = "1088-467X",
    publisher = "IOS Press",
    number = "5",

    }

    A bounded-size clustering algorithm on fully-dynamic streaming graphs. / Zhang, Jianpeng; Pei, Yulong; Fletcher, George; Pechenizkiy, Mykola.

    In: Intelligent Data Analysis, Vol. 22, Nr. 5, 01.09.2018, blz. 1039-1058.

    Onderzoeksoutput: Bijdrage aan tijdschriftTijdschriftartikelAcademicpeer review

    TY - JOUR

    T1 - A bounded-size clustering algorithm on fully-dynamic streaming graphs

    AU - Zhang,Jianpeng

    AU - Pei,Yulong

    AU - Fletcher,George

    AU - Pechenizkiy,Mykola

    PY - 2018/9/1

    Y1 - 2018/9/1

    N2 - Many contemporary data sources in a variety of domains can naturally be represented as fully-dynamic streaming graphs. How to design an efficient online streaming clustering algorithm on such graphs is of great concern. However, existing clustering approaches are inappropriate for this specific task because: (1) static clustering approaches require expensive computational cost to cluster the graph for each update and (2) the existing streaming clustering neither could fully support insertion/deletion of edges nor take temporal information into account. To tackle these issues, in this work, firstly we propose an appropriate streaming clustering model and design two new core components: streaming reservoir and cluster manager. Then we present an evolution-aware bounded-size clustering algorithm to handle the edge additions/deletions. It requires the clusters to satisfy the maximum cluster-size constraint, and maintains the recency of edges in the temporal sequence and gives high priority to the recent edges in each cluster. The experimental results show that the proposed BSC algorithm outperforms current online algorithms and is capable to keep track of the evolution of graphs. Furthermore, it obtains almost one order of magnitude higher throughput than the state-of-the-art algorithms.

    AB - Many contemporary data sources in a variety of domains can naturally be represented as fully-dynamic streaming graphs. How to design an efficient online streaming clustering algorithm on such graphs is of great concern. However, existing clustering approaches are inappropriate for this specific task because: (1) static clustering approaches require expensive computational cost to cluster the graph for each update and (2) the existing streaming clustering neither could fully support insertion/deletion of edges nor take temporal information into account. To tackle these issues, in this work, firstly we propose an appropriate streaming clustering model and design two new core components: streaming reservoir and cluster manager. Then we present an evolution-aware bounded-size clustering algorithm to handle the edge additions/deletions. It requires the clusters to satisfy the maximum cluster-size constraint, and maintains the recency of edges in the temporal sequence and gives high priority to the recent edges in each cluster. The experimental results show that the proposed BSC algorithm outperforms current online algorithms and is capable to keep track of the evolution of graphs. Furthermore, it obtains almost one order of magnitude higher throughput than the state-of-the-art algorithms.

    KW - Streaming graph

    KW - clustering algorithm

    KW - streaming reservoir

    KW - cluster manager

    UR - http://www.scopus.com/inward/record.url?scp=85054193455&partnerID=8YFLogxK

    U2 - 10.3233/IDA-173573

    DO - 10.3233/IDA-173573

    M3 - Article

    VL - 22

    SP - 1039

    EP - 1058

    JO - Intelligent Data Analysis

    T2 - Intelligent Data Analysis

    JF - Intelligent Data Analysis

    SN - 1088-467X

    IS - 5

    ER -