Massive graphs are becoming increasingly common in a variety of domains such as social networks and web analytics. One approach to overcoming the challenges of size is to sample the graph, and perform analytics on the smaller graph. However, to be useful, the sample must maintain the properties of interest in the original graph. In this paper, we analyze the quality of five representative sampling algorithms in how well they preserve graph structure, the bisimulation structure of graphs in particular. As part of this study, we also develop a new scalable algorithm for computing bisimulation partitions of massive graphs. We empirically demonstrate the superior performance of our new algorithm in both sequential and distributed settings.
|Title of host publication||SAC '16 Proceedings of the 31st Annual ACM Symposium on Applied Computing, 4-8 April 2016, Pisa, Italy|
|Place of Publication||New York|
|Publisher||Association for Computing Machinery, Inc|
|Publication status||Published - 2016|
van Heeswijk, W., Fletcher, G. H. L., & Pechenizkiy, M. (2016). On structure preserving sampling and approximate partitioning of graphs. In SAC '16 Proceedings of the 31st Annual ACM Symposium on Applied Computing, 4-8 April 2016, Pisa, Italy (pp. 875-882). Association for Computing Machinery, Inc. https://doi.org/10.1145/2851613.2851650