Discounted Markov games: generalized policy iteration method

J. van der Wal

Research output: Contribution to journal › Article › Academic › peer-review

19 Citations (Scopus)
1 Downloads (Pure)

Abstract

In this paper, we consider two-person zero-sum discounted Markov games with finite state and action spaces. We show that the Newton-Raphson or policy iteration method as presented by Pollatschek and Avi-Itzhak does not necessarily converge, contradicting a proof of Rao, Chandrasekaran, and Nair. Moreover, a set of successive approximation algorithms is presented, of which Shapley's method and a total-expected-rewards version of Hoffman and Karp's method are the extreme elements.
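As context for the family of methods the abstract describes, the following is a minimal Python sketch of Shapley's successive approximation scheme for discounted zero-sum Markov games, the one extreme element of the family the abstract names. It is not the paper's own algorithm listing; the array layout, function names, and the linear-programming formulation of the per-state matrix game are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import linprog

def matrix_game_value(M):
    """Value of the zero-sum matrix game M (row player maximizes) via LP."""
    m, n = M.shape
    # Variables: x_1..x_m (row player's mixed strategy) and v (game value).
    # Maximize v, i.e. minimize -v.
    c = np.zeros(m + 1)
    c[-1] = -1.0
    # For every column j: v - sum_i x_i * M[i, j] <= 0.
    A_ub = np.hstack([-M.T, np.ones((n, 1))])
    b_ub = np.zeros(n)
    # The mixed strategy must sum to 1.
    A_eq = np.ones((1, m + 1))
    A_eq[0, -1] = 0.0
    b_eq = np.array([1.0])
    bounds = [(0, None)] * m + [(None, None)]  # x >= 0, v free
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
    return res.x[-1]

def shapley_iteration(r, p, beta, tol=1e-8, max_iter=10_000):
    """Successive approximation for a discounted zero-sum Markov game.

    r[s] is the |A| x |B| reward matrix in state s; p[s, a, b] is the
    distribution over next states; beta in (0, 1) is the discount factor.
    """
    S = len(r)
    v = np.zeros(S)
    for _ in range(max_iter):
        # One Shapley step: solve the auxiliary matrix game in every state.
        v_new = np.array([
            matrix_game_value(r[s] + beta * p[s] @ v) for s in range(S)
        ])
        if np.max(np.abs(v_new - v)) < tol:
            return v_new
        v = v_new
    return v

# Tiny example: one state with matching-pennies payoffs, discount 0.9.
r = np.array([[[1.0, -1.0], [-1.0, 1.0]]])  # shape (S, A, B)
p = np.ones((1, 2, 2, 1))                   # every action pair returns to state 0
print(shapley_iteration(r, p, beta=0.9))    # ~[0.0]: the stage game is fair
```

Each step is a contraction with modulus beta, so the iterates converge geometrically to the value of the game. The other extreme element the abstract names, the Hoffman-Karp-style method, instead solves a full Markov decision problem per iteration with one player's strategy fixed; the paper's generalized algorithms sit between these two.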
Original language: English
Pages (from-to): 125-138
Number of pages: 14
Journal: Journal of Optimization Theory and Applications
Volume: 25
Issue number: 1
DOIs
Publication status: Published - 1978
