Abstract
In this paper, we present, to our knowledge, the first known I/O efficient solutions for computing the k-bisimulation partition of a massive directed graph, and performing maintenance of such a partition upon updates to the underlying graph. Ubiquitous in the theory and application of graph data, bisimulation is a robust notion of node equivalence which intuitively groups together nodes in a graph which share fundamental structural features. k-bisimulation is the standard variant of bisimulation where the topological features of nodes are only considered within a local neighborhood of radius k > 0.
The I/O cost of our partition construction algorithm is bounded by O(k · sort}(|Et|) + k · scan(|Nt|) + sort(|Nt|)), while our maintenance algorithms are bounded by O(k · sort}(|Et|) + k · scan(|Nt|). The space complexity bounds are O(|Nt|+|Et|)$ and O(k · |Nt|+k ·|Et|), resp. Here, |Et| and |Nt| are the number of disk pages occupied by the input graph's edge set and node set, resp., and sort(n) and scan(n) are the cost of sorting and scanning, resp., a file occupying n pages in external memory. Empirical analysis on a variety of massive real-world and synthetic graph datasets shows that our algorithms perform efficiently in practice, scaling gracefully as graphs grow in size.
Original language | English |
---|---|
Title of host publication | Proceedings of the 22nd ACM International Conference on Information and Knowledge Management (CIKM'13, Burlingame CA, USA, October 27-November 1, 2013) |
Place of Publication | New York |
Publisher | Association for Computing Machinery, Inc |
Pages | 919-928 |
ISBN (Print) | 978-1-4503-2263-8 |
DOIs | |
Publication status | Published - 2013 |
Event | conference; 22nd ACM International Conference on Information and Knowledge Management; 2013-10-27; 2013-11-01 - Duration: 27 Oct 2013 → 1 Nov 2013 |
Conference
Conference | conference; 22nd ACM International Conference on Information and Knowledge Management; 2013-10-27; 2013-11-01 |
---|---|
Period | 27/10/13 → 1/11/13 |
Other | 22nd ACM International Conference on Information and Knowledge Management |