In systems consisting of multiple clusters of processors interconnected by relatively slow network connections, such as our Distributed ASCI Supercomputer (DAS), applications may benefit from the availability of processors in multiple clusters. However, the performance of single-application multicluster execution may be degraded by the slow wide-area links. In addition, scheduling policies for such systems face more restrictions than schedulers for single clusters, because each component of a job has to fit in a separate cluster. In this paper we present a measurement study of the total runtime of two applications, and of the communication time of one of them, both on single clusters and on multicluster systems. We also perform simulations of several multicluster scheduling policies based on our measurement results. Our results show that in many cases, restricted forms of co-allocation in multiclusters perform better than not allowing co-allocation at all.
|Title of host publication||Job Scheduling Strategies for Parallel Processing (9th International Workshop, JSSPP 2003, Seattle WA, USA, June 24, 2003. Revised Papers)|
|Editors||D. Feitelson, L. Rudolph, U. Schwiegelshohn|
|Place of Publication||Berlin|
|Publication status||Published - 2003|
|Name||Lecture Notes in Computer Science|