Abstract
Ever more scientists are employing large-scale distributed systems such as grids for their computational work, instead of tightly coupled high-performance computing systems. However, while these distributed systems are more cost-effective, their heterogeneity in terms of hardware, software, and systems administration, and the lack of accurate resource information lead to inefficient scheduling. In addition, and in contrast to the workloads of tightly coupled high-performance computing systems, a large part of the workloads submitted to these distributed systems consists of large sets (bags) of sequential tasks. Therefore, a realistic performance analysis of scheduling bags-of-tasks in large-scale distributed systems is important. To this end, we introduce in this paper a realistic workload model for bags-of-tasks, and we explore through trace-based simulations the design space of scheduling bags-of-tasks. Finally, we identify three new scheduling policies that use only inaccurate information when scheduling, and we compare them against known classes of proposed scheduling policies.
Field | Value |
---|---|
Original language | English |
Title of host publication | Proceedings of the 17th International Symposium on High Performance Distributed Computing (HPDC'08, Boston MA, USA, June 23-27, 2008) |
Place of Publication | New York NY |
Publisher | Association for Computing Machinery, Inc |
Pages | 97-108 |
ISBN (Print) | 978-1-59593-997-5 |
Publication status | Published - 2008 |