The generation of successive approximation methods for Markov decision processes by using stopping times

J.A.E.E. van Nunen, J. Wessels

Research output: Book/Report › Report › Academic

Abstract

In this paper we will consider several variants of the standard successive approximation technique for Markov decision processes. It will be shown how these variants can be generated by stopping times. Furthermore it will be demonstrated how this class of techniques can be extended to a class of value oriented techniques. This latter class contains as extreme elements several variants of Howard's policy iteration method. For all methods presented extrapolations are given in the form of MacQueen's upper and lower bounds.
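
As a rough illustration of the baseline the abstract refers to, the following is a minimal sketch of standard successive approximation with MacQueen-style upper and lower bounds. It assumes a finite, discounted, reward-maximising Markov decision process, which the abstract itself does not specify. The function name, the argument layout (`P`, `r`, `beta`) and the example numbers are hypothetical. It does not reproduce the stopping-time variants or the value oriented techniques of the report.

```python
def successive_approximation(P, r, beta, tol=1e-8, max_iter=10_000):
    """Standard successive approximation with MacQueen-style bounds.

    P[a][s][t] : probability of moving from state s to t under action a
    r[a][s]    : one-step reward in state s under action a
    beta       : discount factor, assumed to satisfy 0 < beta < 1
    Returns (v, lower, upper); after at least one iteration,
    lower <= v* <= upper holds componentwise (max_iter >= 1 assumed).
    """
    n_actions, n_states = len(r), len(r[0])
    scale = beta / (1.0 - beta)
    v = [0.0] * n_states
    lower = upper = v
    for _ in range(max_iter):
        # One Bellman backup: maximise over actions in every state.
        v_new = [
            max(
                r[a][s] + beta * sum(P[a][s][t] * v[t] for t in range(n_states))
                for a in range(n_actions)
            )
            for s in range(n_states)
        ]
        diff = [v_new[s] - v[s] for s in range(n_states)]
        # Extrapolate using the smallest and largest one-step change.
        lower = [x + scale * min(diff) for x in v_new]
        upper = [x + scale * max(diff) for x in v_new]
        v = v_new
        # Stop once the gap between the bounds is below the tolerance.
        if scale * (max(diff) - min(diff)) < tol:
            break
    return v, lower, upper


# Hypothetical two-state, two-action example (numbers chosen for illustration).
P = [
    [[0.9, 0.1], [0.2, 0.8]],  # action 0
    [[0.5, 0.5], [0.6, 0.4]],  # action 1
]
r = [
    [1.0, 0.0],  # action 0
    [0.5, 2.0],  # action 1
]
v, lower, upper = successive_approximation(P, r, beta=0.9)
print(lower, upper)
```

The bounds usually close faster than the iterates themselves converge, so the stopping test is applied to the gap between `upper` and `lower` rather than to the change in `v`.
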
Original language: English
Place of Publication: Eindhoven
Publisher: Technische Hogeschool Eindhoven
Number of pages: 13
Publication status: Published - 1976

Publication series

Name: Memorandum COSOR
Volume: 7622
ISSN (Print): 0926-4493

Fingerprint

Stopping Time
Successive Approximation
Markov Decision Process
Approximation Methods
Policy Iteration
Iteration Method
Extrapolation
Upper and Lower Bounds
Extremes
Class

Cite this

van Nunen, J. A. E. E., & Wessels, J. (1976). The generation of successive approximation methods for Markov decision processes by using stopping times. (Memorandum COSOR; Vol. 7622). Eindhoven: Technische Hogeschool Eindhoven.
van Nunen, J.A.E.E. ; Wessels, J. / The generation of successive approximation methods for Markov decision processes by using stopping times. Eindhoven : Technische Hogeschool Eindhoven, 1976. 13 p. (Memorandum COSOR).
@book{c42d3c5b018b47ccbc11ba0290b38528,
title = "The generation of successive approximation methods for Markov decision processes by using stopping times",
abstract = "In this paper we will consider several variants of the standard successive approximation technique for Markov decision processes. It will be shown how these variants can be generated by stopping times. Furthermore it will be demonstrated how this class of techniques can be extended to a class of value oriented techniques. This latter class contains as extreme elements several variants of Howard's policy iteration method. For all methods presented extrapolations are given in the form of MacQueen's upper and lower bounds.",
author = "{van Nunen}, J.A.E.E. and J. Wessels",
year = "1976",
language = "English",
series = "Memorandum COSOR",
volume = "7622",
publisher = "Technische Hogeschool Eindhoven",
address = "Eindhoven",
}

van Nunen, JAEE & Wessels, J 1976, The generation of successive approximation methods for Markov decision processes by using stopping times. Memorandum COSOR, vol. 7622, Technische Hogeschool Eindhoven, Eindhoven.

The generation of successive approximation methods for Markov decision processes by using stopping times. / van Nunen, J.A.E.E.; Wessels, J.

Eindhoven : Technische Hogeschool Eindhoven, 1976. 13 p. (Memorandum COSOR; Vol. 7622).

TY - BOOK
T1 - The generation of successive approximation methods for Markov decision processes by using stopping times
AU - van Nunen, J.A.E.E.
AU - Wessels, J.
PY - 1976
Y1 - 1976
N2 - In this paper we will consider several variants of the standard successive approximation technique for Markov decision processes. It will be shown how these variants can be generated by stopping times. Furthermore it will be demonstrated how this class of techniques can be extended to a class of value oriented techniques. This latter class contains as extreme elements several variants of Howard's policy iteration method. For all methods presented extrapolations are given in the form of MacQueen's upper and lower bounds.
M3 - Report
T3 - Memorandum COSOR
BT - The generation of successive approximation methods for Markov decision processes by using stopping times
PB - Technische Hogeschool Eindhoven
CY - Eindhoven
ER -

van Nunen JAEE, Wessels J. The generation of successive approximation methods for Markov decision processes by using stopping times. Eindhoven: Technische Hogeschool Eindhoven, 1976. 13 p. (Memorandum COSOR).