The generation of successive approximation methods for Markov decision processes by using stopping times

J.A.E.E. van Nunen, J. Wessels

    Research output: Book/Report › Report › Academic

    Abstract

    This paper considers several variants of the standard successive approximation technique for Markov decision processes and shows how these variants can be generated by stopping times. It also demonstrates how this class of techniques can be extended to a class of value-oriented techniques, whose extreme elements include several variants of Howard's policy iteration method. For all methods presented, extrapolations are given in the form of MacQueen's upper and lower bounds.
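
    As an illustration of the baseline that the abstract refers to, the following is a minimal sketch of standard successive approximation with MacQueen-style extrapolation bounds. It is not taken from the report: it assumes a finite, discounted MDP with discount factor `beta` in (0, 1), and the function name, array layout, and stopping rule are hypothetical choices made for this example. The stopping-time variants and value-oriented methods of the report are not reproduced here.

```python
import numpy as np


def value_iteration_macqueen(P, r, beta, tol=1e-8, max_iter=10_000):
    """Standard successive approximation with MacQueen-style bounds.

    Assumed (hypothetical) layout, not taken from the report:
      P    : array (A, S, S), P[a, s, t] = transition probability s -> t under action a
      r    : array (A, S), one-step reward for action a in state s
      beta : discount factor, 0 < beta < 1
    """
    n_states = P.shape[1]
    v = np.zeros(n_states)
    for _ in range(max_iter):
        q = r + beta * (P @ v)      # shape (A, S): value of each action in each state
        v_new = q.max(axis=0)       # one successive approximation step
        diff = v_new - v
        # Extrapolation: v_new + beta/(1-beta) * min(diff) <= v* <= v_new + beta/(1-beta) * max(diff)
        lower = v_new + beta / (1.0 - beta) * diff.min()
        upper = v_new + beta / (1.0 - beta) * diff.max()
        v = v_new
        if np.max(upper - lower) < tol:
            break
    policy = q.argmax(axis=0)       # greedy action per state at the last iterate
    return 0.5 * (lower + upper), lower, upper, policy


if __name__ == "__main__":
    # Toy two-state, two-action example with made-up numbers.
    P = np.array([[[0.9, 0.1], [0.2, 0.8]],
                  [[0.5, 0.5], [0.6, 0.4]]])
    r = np.array([[1.0, 0.0],
                  [0.2, 2.0]])
    v_est, lower, upper, policy = value_iteration_macqueen(P, r, beta=0.9)
    print(v_est, policy)
```

    The bounds use only the smallest and largest components of the difference between two successive iterates, so they cost almost nothing per iteration and usually allow the iteration to stop well before the iterates themselves have converged.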
    Original language: English
    Place of publication: Eindhoven
    Publisher: Technische Hogeschool Eindhoven
    Number of pages: 13
    Status: Published - 1976

    Publication series

    Name: Memorandum COSOR
    Volume: 7622
    ISSN (print): 0926-4493
