In this paper we consider some problems and results in the field of Markov decision processes with an incompletely known transition law. We consider the discounted total return under the Bayes criterion. We discuss easy-to-handle strategies that are optimal under some conditions in the average return case, and also for some special models in the discounted total return case. Furthermore, we provide approximation methods for computing the optimal value.
|ISSN of printed version||0926-4493|