We consider a system of parallel queues with dedicated arrival streams. At each decision epoch a decision-maker can move customers from one queue to another. The cost for moving customers consists of a fixed cost and a linear, variable cost dependent on the number of customers moved. There are also linear holding costs that may depend on the queue in which customers are stored. Under very mild assumptions, we develop stability (and instability) conditions for this system via a fluid model. Under the assumption of stability, we consider minimizing the long-run average cost. In the case of two-servers the optimal control policy is shown to prefer to store customers in the lowest cost queue. When the inter-arrival and service times are assumed to be exponential, we use a Markov decision process formulation to show that for a fixed number of customers in the system, there exists a level S such that whenever customers are moved from the high cost queue to the low cost queue, the number of customers moved brings the number of customers in the low cost queue to S. These results lead to the development of a heuristic for the model with more than two servers.
Down, D. G., & Lewis, M. E. (2006). Dynamic load balancing in parallel queueing systems : stability and optimal control. European Journal of Operational Research, 168(2), 509-519. https://doi.org/10.1016/j.ejor.2004.04.041