Hypervolume-based multi-objective reinforcement learning

Kristof Van Moffaert, M.M. Drugan, Ann Nowe

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Academic › peer-review

30 Citations (Scopus)

Abstract

Indicator-based evolutionary algorithms are amongst the best-performing methods for solving multi-objective optimization (MOO) problems. In reinforcement learning (RL), introducing a quality indicator into an algorithm's decision logic had not been attempted before. In this paper, we propose a novel on-line multi-objective reinforcement learning (MORL) algorithm that uses the hypervolume indicator as an action selection strategy. We call this algorithm the hypervolume-based MORL algorithm, or HB-MORL, and conduct an empirical study of its performance using multiple quality assessment metrics from multi-objective optimization. On several environments, we compare the hypervolume-based learning algorithm to two multi-objective algorithms that rely on scalarization techniques, namely linear scalarization and the weighted Chebyshev function. We conclude that HB-MORL significantly outperforms the linear scalarization method and performs similarly to the Chebyshev algorithm without requiring any user-specified emphasis on particular objectives.
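The core idea of the abstract — using the hypervolume indicator to drive action selection instead of a scalarization function — can be sketched in a few lines. This is an illustrative reconstruction, not the paper's implementation: the function names (`hypervolume_2d`, `hv_action_select`), the restriction to two maximized objectives, and the reference-point convention are all assumptions made here for brevity.

```python
def hypervolume_2d(points, ref):
    """Hypervolume (area) dominated by `points` relative to the reference
    point `ref`, assuming two objectives that are both maximized and that
    every point weakly dominates `ref`. (2-objective case for illustration.)"""
    # Sweep points in decreasing order of the first objective; each point
    # contributes a rectangle above the best second-objective value seen so far.
    hv = 0.0
    prev_y = ref[1]
    for x, y in sorted(points, key=lambda p: (-p[0], -p[1])):
        if y > prev_y:  # dominated points add nothing and are skipped
            hv += (x - ref[0]) * (y - prev_y)
            prev_y = y
    return hv

def hv_action_select(q_vectors, archive, ref):
    """Greedy hypervolume-based action selection (hypothetical sketch):
    pick the action whose multi-objective Q-vector, added to the current
    archive of value vectors, yields the largest hypervolume."""
    return max(range(len(q_vectors)),
               key=lambda a: hypervolume_2d(archive + [q_vectors[a]], ref))
```

For example, with archive `[(4, 1)]` and reference point `(0, 0)`, a candidate Q-vector `(2, 2)` enlarges the dominated area more than `(0, 5)` does, so the first action is selected. Unlike linear or Chebyshev scalarization, no user-specified weight vector is needed.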
Original language: English
Title of host publication: Evolutionary Multi-Criterion Optimization
Subtitle of host publication: 7th International Conference, EMO 2013, Sheffield, UK, March 19-22, 2013. Proceedings
Editors: R.C. Purshouse, P.J. Fleming, C.M. Fonseca, S. Greco, J. Shaw
Place of Publication: Berlin
Publisher: Springer
Pages: 352-366
ISBN (Electronic): 978-3-642-37140-0
ISBN (Print): 978-3-642-37139-4
DOIs
Publication status: Published - 2013
Externally published: Yes

Publication series

Name: Lecture Notes in Computer Science
Volume: 7811

Keywords

  • multi-objective optimization
  • hypervolume unary indicator
  • reinforcement learning
