In this paper, we present, to our knowledge, the fi??rst known I/O e??cient solutions for computing the
k-bisimulation partition of a massive graph, and performing maintenance of such a partition upon updates to the underlying graph. Bisimulation is a robust notion of node equivalence which is ubiquitous in the theory and application of graph data. It defi??nes an intuitive notion of nodes in a graph sharing fundamental structural features. We consider in particular k-bisimulation, which is the standard variant of bisimulation where the topological features of nodes are only considered within a local neighborhood of radius k > 0.
The I/O cost of our partition construction algorithm is bounded by O(k.sort(Et) + k.scan(Nt) + sort(jNtj)),
while our maintenance algorithms are bounded by O(k.sort(Et) + k.sort(Nt)). Here, Et and Nt are the
number of disk pages occupied by the input graph's edge set and node set, resp., and sort(n) and scan(n) are the cost of sorting and scanning, resp., a ??le occupying n pages in external memory. Empirical analysis on a variety of massive real-world and synthetic graph datasets shows that
our algorithms not only perform e??ciently, but also scale gracefully as graphs grow in size.
|Number of pages||15|
|Publication status||Published - 2012|