This paper considers a Markov decision process with finite state and action spaces as the discount factor tends to 1. Miller and Veinott have shown the existence of n-discount optimal policies, and Veinott has given an algorithm to determine one. In this paper we use stopping times, as introduced by Wessels, to generate a set of modified policy iteration algorithms for determining an n-discount optimal strategy.
|Place of Publication||Eindhoven|
|Publisher||Technische Hogeschool Eindhoven|
|Number of pages||17|
|Publication status||Published - 1978|