This report considers the same situation as Hordijk, Dynamic programming and Markov potential theory [3], viz. a countable state space Markov decision process which can be stopped. Costs have the so-called charge structure and the optimality criterion is the total expected gain. It is shown that an optimal strategy, consisting of a memoryless decision rule and a possibly nonmemoryless stopping rule, can be replaced by a strategy consisting of the same decision rule and a stopping rule which is an entry time.
| Name | Memorandum COSOR |
|---|---|
| Volume | 7501 |
| ISSN (Print) | 0926-4493 |