We consider a system of N servers inter-connected by some underlying graph topology G_N. Tasks with unit-mean exponential processing times arrive at the various servers as independent Poisson processes of rate λ. Each incoming task is irrevocably assigned to whichever server has the smallest number of tasks among the one where it appears and its neighbors in G_N. The above model arises in the context of load balancing in large-scale cloud networks and data centers, and has been extensively investigated in the case where G_N is a clique. Since the servers are exchangeable in that case, mean-field limits apply, and in particular it has been proved that for any λ < 1, the fraction of servers with two or more tasks vanishes in the limit as N → ∞. For an arbitrary graph G_N, mean-field techniques break down, complicating the analysis, and the queue length process tends to be worse than for a clique. Accordingly, a graph G_N is said to be N-optimal or √N-optimal when the queue length process on G_N is equivalent to that on a clique on an N-scale or √N-scale, respectively. We prove that if G_N is an Erdős-Rényi random graph with average degree d(N), then with high probability it is N-optimal and √N-optimal if d(N) → ∞ and d(N)/(√N log(N)) → ∞ as N → ∞, respectively. This demonstrates that optimality can be maintained at N-scale and √N-scale while reducing the number of connections by nearly a factor of N and √N/log(N), respectively, compared to a clique, provided the topology is suitably random. It is further shown that if G_N contains Θ(N) bounded-degree nodes, then it cannot be N-optimal. In addition, we establish that an arbitrary graph G_N is N-optimal when its minimum degree is N - o(N), and may not be N-optimal even when its minimum degree is cN + o(N) for any 0 < c < 1/2. Simulation experiments are conducted for various scenarios to corroborate the asymptotic results.
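The assignment rule described in the abstract can be illustrated with a small simulation. The following is a minimal sketch, not the authors' simulation code: the function names (`erdos_renyi`, `simulate`, `fraction_with_at_least`), the uniformly random tie-breaking, the edge probability d/N, and all parameter values are assumptions made for illustration. It uses uniformization (events at total rate N(λ + 1), each either an arrival or a potential departure at a uniformly chosen server) to simulate the Markov chain.

```python
import random


def erdos_renyi(n, p, rng):
    """Adjacency lists of a G(n, p) random graph (no self-loops)."""
    nbrs = [[] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                nbrs[i].append(j)
                nbrs[j].append(i)
    return nbrs


def simulate(nbrs, lam, horizon, rng):
    """Run the system from empty until time `horizon`; return queue lengths.

    Uniformization: events occur at total rate n * (lam + 1). Each event
    picks a uniform server i and is an arrival at i with probability
    lam / (lam + 1), otherwise a potential departure from i.
    """
    n = len(nbrs)
    q = [0] * n
    t = 0.0
    rate = n * (lam + 1.0)
    p_arrival = lam / (lam + 1.0)
    while True:
        t += rng.expovariate(rate)
        if t > horizon:
            return q
        i = rng.randrange(n)
        if rng.random() < p_arrival:
            # Join the shortest queue among server i and its neighbors.
            candidates = [i] + nbrs[i]
            shortest = min(q[j] for j in candidates)
            # Assumption: ties are broken uniformly at random.
            target = rng.choice([j for j in candidates if q[j] == shortest])
            q[target] += 1
        elif q[i] > 0:
            # Unit-rate exponential service completes at a busy server.
            q[i] -= 1


def fraction_with_at_least(q, k):
    """Fraction of servers holding at least k tasks."""
    return sum(1 for x in q if x >= k) / len(q)


rng = random.Random(0)
N, d, lam = 200, 20, 0.7           # illustrative values only
graph = erdos_renyi(N, d / N, rng)  # average degree roughly d
queues = simulate(graph, lam, horizon=20.0, rng=rng)
print("fraction with >= 2 tasks:", fraction_with_at_least(queues, 2))
```

Replacing `graph` with a clique (every server adjacent to every other) gives the baseline against which the abstract compares sparser topologies; a single finite-N run like this only hints at the asymptotic behavior.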
| Number of pages | 29 |
| Journal | Proceedings of the ACM on Measurement and Analysis of Computing Systems |
| Publication status | Published - Apr 2018 |
- asymptotic optimality, cloud networking, data centers, delay performance, load balancing, load balancing on graphs, power-of-d scheme, scaling limits