For Markov decision processes with countable state space and nonnegative immediate rewards, Ornstein proved the existence of a stationary strategy f which is uniformly nearly optimal in the following multiplicative sense: v(f) ≥ (1 − ε)v*. Strauch proved that if the immediate rewards are nonpositive and the action space is finite, then a uniformly optimal stationary strategy exists. This paper connects these partial results and proves the following theorem for Markov decision processes with countable state space and arbitrary action space: if in each state where the value is nonpositive a conserving action exists, then for every ε > 0 there is a stationary strategy f satisfying v(f) ≥ v* − εu*, where u* is the value of the problem when only the positive rewards are counted.
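For orientation, here is a compact restatement in standard MDP notation. This is a sketch only: the definition of a conserving action and the symbols r, p, v*, u* follow the usual total-reward conventions and are not quoted from the paper.

```latex
% Sketch in standard total-reward MDP notation (assumed conventions,
% not the paper's exact formulation).
% v^* : value of the original problem; u^* : value when only the
%       positive parts of the rewards are counted.
\begin{itemize}
  \item An action $a$ is \emph{conserving} in state $s$ if
        $r(s,a) + \sum_{t \in S} p(t \mid s,a)\, v^*(t) = v^*(s)$.
  \item Assumption: a conserving action exists in every state $s$
        with $v^*(s) \le 0$.
  \item Conclusion: for every $\varepsilon > 0$ there is a stationary
        strategy $f$ with $v(f) \ge v^* - \varepsilon\, u^*$,
        uniformly in the initial state.
\end{itemize}
```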
| Name | Memorandum COSOR |
| --- | --- |
| Volume | 8111 |
| ISSN (Print) | 0926-4493 |