Efficient model sharing for scalable collaborative classification

Odysseas Papapetrou, Wolf Siberski, Stefan Siersdorfer

Research output: Contribution to journal › Article › Academic › peer-review

1 Citation (Scopus)

Abstract

We propose a novel collaborative approach for document classification, combining the knowledge of multiple users for improved organization of data such as individual document repositories or emails. To this end, we distribute locally built classification models in a network of participating users, and combine the shared classifiers into more powerful meta models. In order to increase the propagation efficiency, we apply a method for selecting the most discriminative model components and transmitting them to other participants. In our experiments on four large standard collections for text classification we study the resulting tradeoffs between network cost and classification accuracy. The experimental results show that the proposed model propagation has negligible communication costs and substantially outperforms current approaches with respect to efficiency and classification quality.
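The pipeline described in the abstract (train locally, share only the most discriminative model components, combine the received models into a meta model) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration, not the authors' method: the toy linear model, the use of absolute weight as the measure of discriminativeness, the unweighted vote, and all function names and sample data are hypothetical.

```python
from collections import Counter, defaultdict


def train_local_model(docs):
    """Toy linear model (assumption): a term's weight is its count in
    positive documents minus its count in negative documents."""
    weights = defaultdict(float)
    for tokens, label in docs:  # label is +1 or -1
        for term, count in Counter(tokens).items():
            weights[term] += label * count
    return dict(weights)


def select_top_components(weights, k):
    """Keep the k terms with the largest absolute weight (assumed proxy
    for 'most discriminative'); ties are broken alphabetically."""
    ranked = sorted(weights.items(), key=lambda kv: (-abs(kv[1]), kv[0]))
    return dict(ranked[:k])


def score(model, tokens):
    """Sum the weights of the tokens that the sparse model knows."""
    return sum(model.get(t, 0.0) for t in tokens)


def meta_predict(models, tokens):
    """Meta model (assumption): unweighted vote over the signs of the
    individual model scores; a zero score abstains, a tied vote gives +1."""
    votes = 0
    for model in models:
        s = score(model, tokens)
        votes += (s > 0) - (s < 0)
    return 1 if votes >= 0 else -1


# Hypothetical local training sets of two peers: +1 = sports, -1 = finance.
peer_a = [(["goal", "match", "team"], 1), (["stock", "market"], -1)]
peer_b = [(["team", "coach", "league"], 1), (["market", "bank", "loan"], -1)]

# Each peer transmits only its k strongest components instead of the full model.
models = [select_top_components(train_local_model(d), k=2) for d in (peer_a, peer_b)]

print(meta_predict(models, ["coach", "goal"]))
print(meta_predict(models, ["bank", "market"]))
```

The parameter `k` stands in for the tradeoff the abstract mentions: a smaller `k` lowers the amount of data each peer sends, at the risk of dropping terms the meta model would need.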

Original language: English
Pages (from-to): 384-398
Number of pages: 15
Journal: Peer-to-Peer Networking and Applications
Volume: 8
Issue number: 3
DOI: 10.1007/s12083-014-0259-1
Publication status: Published - 1 May 2015
Externally published: Yes

Keywords

  • Clustering, classification and association rules
  • Data mining
  • Peer-to-peer

Cite this

Papapetrou, Odysseas ; Siberski, Wolf ; Siersdorfer, Stefan. / Efficient model sharing for scalable collaborative classification. In: Peer-to-Peer Networking and Applications. 2015 ; Vol. 8, No. 3. pp. 384-398.
@article{7067e44c5a624b519393cf3c8d797198,
title = "Efficient model sharing for scalable collaborative classification",
abstract = "We propose a novel collaborative approach for document classification, combining the knowledge of multiple users for improved organization of data such as individual document repositories or emails. To this end, we distribute locally built classification models in a network of participating users, and combine the shared classifiers into more powerful meta models. In order to increase the propagation efficiency, we apply a method for selecting the most discriminative model components and transmitting them to other participants. In our experiments on four large standard collections for text classification we study the resulting tradeoffs between network cost and classification accuracy. The experimental results show that the proposed model propagation has negligible communication costs and substantially outperforms current approaches with respect to efficiency and classification quality.",
keywords = "Clustering, classification and association rules, Data mining, Peer-to-peer",
author = "Odysseas Papapetrou and Wolf Siberski and Stefan Siersdorfer",
year = "2015",
month = "5",
day = "1",
doi = "10.1007/s12083-014-0259-1",
language = "English",
volume = "8",
pages = "384--398",
journal = "Peer-to-Peer Networking and Applications",
issn = "1936-6442",
publisher = "Springer",
number = "3",
}

Efficient model sharing for scalable collaborative classification. / Papapetrou, Odysseas; Siberski, Wolf; Siersdorfer, Stefan.

In: Peer-to-Peer Networking and Applications, Vol. 8, No. 3, 01.05.2015, p. 384-398.

TY  - JOUR
T1  - Efficient model sharing for scalable collaborative classification
AU  - Papapetrou, Odysseas
AU  - Siberski, Wolf
AU  - Siersdorfer, Stefan
PY  - 2015/5/1
Y1  - 2015/5/1
N2  - We propose a novel collaborative approach for document classification, combining the knowledge of multiple users for improved organization of data such as individual document repositories or emails. To this end, we distribute locally built classification models in a network of participating users, and combine the shared classifiers into more powerful meta models. In order to increase the propagation efficiency, we apply a method for selecting the most discriminative model components and transmitting them to other participants. In our experiments on four large standard collections for text classification we study the resulting tradeoffs between network cost and classification accuracy. The experimental results show that the proposed model propagation has negligible communication costs and substantially outperforms current approaches with respect to efficiency and classification quality.
AB  - We propose a novel collaborative approach for document classification, combining the knowledge of multiple users for improved organization of data such as individual document repositories or emails. To this end, we distribute locally built classification models in a network of participating users, and combine the shared classifiers into more powerful meta models. In order to increase the propagation efficiency, we apply a method for selecting the most discriminative model components and transmitting them to other participants. In our experiments on four large standard collections for text classification we study the resulting tradeoffs between network cost and classification accuracy. The experimental results show that the proposed model propagation has negligible communication costs and substantially outperforms current approaches with respect to efficiency and classification quality.
KW  - Clustering, classification and association rules
KW  - Data mining
KW  - Peer-to-peer
UR  - http://www.scopus.com/inward/record.url?scp=84928707724&partnerID=8YFLogxK
U2  - 10.1007/s12083-014-0259-1
DO  - 10.1007/s12083-014-0259-1
M3  - Article
AN  - SCOPUS:84928707724
VL  - 8
SP  - 384
EP  - 398
JO  - Peer-to-Peer Networking and Applications
JF  - Peer-to-Peer Networking and Applications
SN  - 1936-6442
IS  - 3
ER  -