A stopping time-based policy iteration algorithm for average reward Markov decision processes

J. van der Wal

Research output: Book/Report › Report › Academic

