Abstract
For semi-Markov decision processes with discounted rewards, we derive the well-known results on the structure of optimal strategies (nonrandomized, stationary Markov strategies) and on the standard algorithms (linear programming, policy iteration). Our analysis is based entirely on a primal linear programming formulation of the problem.
Original language | English |
---|---|
Pages (from-to) | 1-7 |
Number of pages | 7 |
Journal | Statistica Neerlandica |
Volume | 29 |
Issue number | 1 |
Publication status | Published - 1975 |