Although grid workloads are, with few exceptions, dominated by single-node jobs, not all of these jobs are necessarily independent or unrelated. For instance, jobs may be grouped because users submit them in batches, e.g., to perform parameter sweeps. However, no data has been reported that confirms the presence and structure of these groupings, despite the large potential impact of such information. To address this gap, we present a first investigation into the characteristics of groups of jobs in grid workloads. First, we define three types of job groupings: batch, continued, and bursty submissions. Then, we analyze the characteristics of these groupings in three long-term traces from currently deployed grid environments. Notably, our results show that the various groupings account for up to 96% of the total CPU time consumption. Finally, we present insights into the performance of real grids in dealing with grouped jobs.
|Title of host publication||Euro-Par 2007 - Parallel Processing (13th International Euro-Par Conference, Rennes, France, August 28-31, 2007. Proceedings)|
|Editors||A.M. Kermarrec, L. Bougé, T. Priol|
|Place of Publication||Berlin|
|Publication status||Published - 2007|
|Series name||Lecture Notes in Computer Science|