A stopping time-based policy iteration algorithm for Markov decision processes with discount factor tending to 1

J. van der Wal

Research output: Book/Report › Report › Academic


Abstract

This paper considers the Markov decision process with finite state and action spaces, when the discount factor tends to 1. Miller and Veinott have shown the existence of n-discount optimal policies, and Veinott has given an algorithm to determine one. In this paper we use the stopping times introduced by Wessels to generate a set of modified policy iteration algorithms for the determination of an n-discount optimal strategy.
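For orientation only: the report builds on the classical policy iteration scheme for discounted MDPs, modified via Wessels-style stopping times. The sketch below shows only the standard policy iteration algorithm for a fixed discount factor β < 1, not the paper's stopping-time variant or its n-discount analysis; the two-state MDP (transition matrices `P`, rewards `r`) is an invented toy example.

```python
import numpy as np

# Hypothetical toy MDP: 2 states, 2 actions.
# P[a][s, :] = transition probabilities from state s under action a;
# r[a][s]    = expected one-step reward in state s under action a.
P = [np.array([[0.9, 0.1], [0.2, 0.8]]),
     np.array([[0.5, 0.5], [0.6, 0.4]])]
r = [np.array([1.0, 0.0]), np.array([0.5, 0.8])]
beta = 0.95  # discount factor, fixed here (the report studies beta -> 1)

def policy_iteration(P, r, beta, n_states=2, n_actions=2):
    policy = np.zeros(n_states, dtype=int)
    while True:
        # Policy evaluation: solve (I - beta * P_f) v = r_f for the
        # value vector v of the current stationary policy f.
        Pf = np.array([P[policy[s]][s] for s in range(n_states)])
        rf = np.array([r[policy[s]][s] for s in range(n_states)])
        v = np.linalg.solve(np.eye(n_states) - beta * Pf, rf)
        # Policy improvement: greedy one-step lookahead on v.
        q = np.array([[r[a][s] + beta * P[a][s] @ v
                       for a in range(n_actions)]
                      for s in range(n_states)])
        new_policy = q.argmax(axis=1)
        if np.all(new_policy == policy):
            return policy, v  # policy is beta-discount optimal
        policy = new_policy

policy, v = policy_iteration(P, r, beta)
```

For a finite MDP this loop terminates in finitely many steps, since each improvement step yields a strictly better policy until a fixed point of the Bellman optimality equation is reached. The report's stopping-time machinery replaces the exact one-step evaluation/improvement cycle with a family of intermediate schemes.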
Original language: English
Place of publication: Eindhoven
Publisher: Technische Hogeschool Eindhoven
Number of pages: 17
Publication status: Published - 1978

Publication series

Name: Memorandum COSOR
Volume: 7824
ISSN (Print): 0926-4493


Cite this

van der Wal, J. (1978). A stopping time-based policy iteration algorithm for Markov decision processes with discount factor tending to 1 (Memorandum COSOR; Vol. 7824). Eindhoven: Technische Hogeschool Eindhoven.