We develop an online gradient algorithm for optimizing the performance of product-form networks through online adjustment of control parameters. Standard algorithms for finding optimal parameter settings are hampered by the prohibitive computational burden of calculating the gradient in terms of the stationary probabilities. The proposed approach instead estimates the gradient from empirical frequencies of the various states, measured through simulation or online operation. Besides reducing the computational effort, online operation has the further benefit of adapting naturally to slow variations in ambient parameters, which commonly occur in dynamic environments. On the downside, the measurements yield inherently noisy and biased estimates. We exploit mixing time results to overcome the impact of the bias and establish sufficient conditions for convergence to a globally optimal solution.
Keywords: Gradient algorithm, Markov processes, mixing times, online performance optimization, product-form networks, stochastic approximation, dynamic control.
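The measurement-based gradient idea summarized in the abstract can be illustrated with a minimal sketch on a toy model. Everything specific here is an assumption made for illustration and is not taken from the paper: a single finite queue with capacity `K`, unit service rate, and controlled arrival rate `exp(r)`, whose stationary distribution has the product form pi_r(x) proportional to exp(r * x); the convex objective log Z(r) - r * target, whose gradient is the stationary mean queue length minus `target`; and the step sizes 0.5 / n^0.75. The function names are hypothetical. The gradient is never computed from stationary probabilities. It is estimated from the fraction of time spent in each state during a finite observation window, so the estimate is noisy and, because the chain has not fully mixed, biased.

```python
import math
import random


def stationary_mean(r, K):
    """Exact mean of pi_r(x) proportional to exp(r * x) on {0, ..., K}.

    Used only to check the result; the online algorithm never calls it.
    """
    weights = [math.exp(r * x) for x in range(K + 1)]
    return sum(x * w for x, w in enumerate(weights)) / sum(weights)


def observe_frequencies(r, K, state, horizon, rng):
    """Run the queue for `horizon` time units from `state`.

    Returns the fraction of time spent in each state and the final state.
    """
    time_in_state = [0.0] * (K + 1)
    elapsed = 0.0
    while elapsed < horizon:
        up = math.exp(r) if state < K else 0.0   # controlled arrival rate
        down = 1.0 if state > 0 else 0.0         # unit service rate
        total = up + down
        # Exponential holding time, truncated at the end of the window.
        hold = min(rng.expovariate(total), horizon - elapsed)
        time_in_state[state] += hold
        elapsed += hold
        if elapsed < horizon:
            state = state + 1 if rng.random() < up / total else state - 1
    return [t / horizon for t in time_in_state], state


def online_gradient(target, K=5, steps=500, horizon=100.0, seed=0):
    """Adjust r online so that the mean queue length approaches `target`."""
    rng = random.Random(seed)
    r, state = 0.0, 0
    for n in range(1, steps + 1):
        # The state carries over between windows, as in online operation.
        freqs, state = observe_frequencies(r, K, state, horizon, rng)
        # Gradient estimate: empirical mean queue length minus the target.
        grad = sum(x * f for x, f in enumerate(freqs)) - target
        r -= 0.5 / n ** 0.75 * grad              # assumed step-size rule
    return r


if __name__ == "__main__":
    r_hat = online_gradient(target=1.5)
    print("estimated r:", r_hat)
    print("stationary mean at estimated r:", stationary_mean(r_hat, 5))
```

Calling `online_gradient(target=1.5)` returns a parameter value whose exact stationary mean, computed with `stationary_mean`, can be compared against the target. Longer observation windows reduce the bias of each gradient estimate at the cost of fewer parameter updates per unit of time.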
Status: Published - 2012