We design a dynamic algorithm for dimensioning and stabilizing a cloud provisioning process. We model the process as a semi-open tandem network consisting of a multi-server queue, whose host servers respond to new requests from cloud users, and an infinite-server queue that contains the active cloud users. The algorithm matches load predictions by jointly setting the number of host servers and the total available capacity. This dual setting is time-dependent and based on the modified offered load (MOL) method. The algorithm is designed to operate in the Quality-and-Efficiency-Driven (QED) regime to achieve economies of scale, and it supports repeated requests, whereby initially blocked users retry to access the cloud system after a certain delay. Extensive numerical simulations show that the algorithm stabilizes performance at high QoS levels, even in the face of strongly time-varying loads and repeated requests.
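To make the MOL and QED ideas concrete, the following minimal Python sketch combines the two in their textbook form. It is not the algorithm of the paper: it covers only a single infinite-server stage with time-varying Poisson arrivals and ignores the tandem structure, blocking, and repeated requests. The function names, the QoS parameter `beta`, and the sinusoidal arrival rate are all assumptions chosen for illustration. The offered load m(t) is obtained by integrating m'(t) = λ(t) − μ·m(t), and capacity is then set by the square-root rule m(t) + β·√m(t).

```python
import math


def modified_offered_load(arrival_rate, mu, horizon, dt=0.01, m0=0.0):
    """Approximate the time-varying offered load m(t).

    Uses forward-Euler integration of m'(t) = arrival_rate(t) - mu * m(t),
    the mean number of busy servers in an infinite-server queue with
    exponential service at rate mu. Returns m at times 0, dt, 2*dt, ...
    """
    steps = int(round(horizon / dt))
    m = m0
    loads = [m]
    for k in range(steps):
        t = k * dt
        m += dt * (arrival_rate(t) - mu * m)
        loads.append(m)
    return loads


def square_root_staffing(load, beta):
    """Capacity under the square-root rule: load plus beta * sqrt(load)."""
    return math.ceil(load + beta * math.sqrt(load))


# Hypothetical example: sinusoidal arrival rate around 100 requests per time unit.
def example_rate(t):
    return 100.0 + 40.0 * math.sin(2.0 * math.pi * t / 24.0)


loads = modified_offered_load(example_rate, mu=1.0, horizon=48.0)
# Time-dependent capacity, one value per time step (beta = 1.0 is an assumed QoS target).
capacity = [square_root_staffing(m, beta=1.0) for m in loads]
```

In this sketch the capacity follows the offered load rather than the instantaneous arrival rate, so it lags and smooths the arrival peaks, which is the point of basing staffing on the MOL instead of on λ(t)/μ directly.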
|Title of host publication|VALUETOOLS'15 Proceedings of the 9th EAI International Conference on Performance Evaluation Methodologies and Tools, 14-16 December 2015, Berlin, Germany|
|Place of Publication|New York|
|Publisher|Association for Computing Machinery, Inc|
|Publication status|Published - 2016|