The action elimination algorithm for Markov decision processes

N.A.J. Hastings, J.A.E.E. van Nunen

Research output: Book/ReportReportAcademic

91 Downloads (Pure)

Abstract

An efficient algorithm for solving Markov decision problems is proposed. The value iteration method of dynamic programming is used in conjunction with a test for nonoptimal actions. The algorithm applies to problems with undiscounted or discounted returns with infinite or finite planning horizon. In the finite horizon case the discount factor may exceed unity. The nonoptimality test, which is an extension of Hastings test for the undiscounted reward case, is used to identify actions which cannot be optimal at the current stage. As convergence proceeds the proportion of such actions increases producing major computational savings. For problems with discount factor less than one the test is shown to be tighter than that of MacQueen.
Original languageEnglish
Place of PublicationEindhoven
PublisherTechnische Hogeschool Eindhoven
Number of pages10
Publication statusPublished - 1976

Publication series

NameMemorandum COSOR
Volume7620
ISSN (Print)0926-4493

Fingerprint

Dive into the research topics of 'The action elimination algorithm for Markov decision processes'. Together they form a unique fingerprint.

Cite this