Recently, Federgruen and Schweitzer [3] proved that in undiscounted Markov decision problems the value iteration method for finding maximal gain policies converges geometrically fast whenever convergence occurs. This result was obtained without any restriction on either the periodicity or the chain structure of the problem. In this paper we establish the same result; the proof, however, seems essentially simpler and, moreover, yields an upper bound for the convergence rate.
| Name | Memorandum COSOR |
|---|---|
| Volume | 8008 |
| ISSN (Print) | 0926-4493 |