Ensemble-based Deep Reinforcement Learning for Vehicle Routing Problems under Distribution Shift

Yuan Jiang, Zhiguang Cao, Yaoxin Wu, Wen Song, Jie Zhang

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, Academic, peer-reviewed

Abstract

While performing favourably on independent and identically distributed (i.i.d.) instances, most existing neural methods for vehicle routing problems (VRPs) struggle to generalize in the presence of a distribution shift. To tackle this issue, we propose an ensemble-based deep reinforcement learning method for VRPs, which learns a group of diverse sub-policies to cope with various instance distributions. In particular, to prevent the parameters of the sub-policies from converging to the same values, we enforce diversity across sub-policies by leveraging Bootstrap with random initialization. Moreover, we explicitly pursue inequality between sub-policies by exploiting regularization terms during training to further enhance diversity. Experimental results show that our method outperforms the state-of-the-art neural baselines on randomly generated instances of various distributions, and also generalizes favourably on the benchmark instances from TSPLib and CVRPLib, which confirms the effectiveness of the whole method and of the respective designs.
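
The abstract names two diversity mechanisms, Bootstrap with random initialization and regularization terms, without giving their exact form. The following is a minimal sketch of one common way such mechanisms are realized, not the authors' implementation. It assumes Bernoulli bootstrap masks that decide which training instances each sub-policy sees, and a mean pairwise KL divergence between the sub-policies' action distributions as the diversity term. The names `bootstrap_masks`, `kl_divergence`, `diversity_bonus` and the parameter `keep_prob` are hypothetical.

```python
import math
import random


def bootstrap_masks(num_instances, num_policies, keep_prob=0.8, seed=0):
    """Return masks[k][i] = 1 if sub-policy k trains on instance i, else 0.

    Assumption: independent Bernoulli(keep_prob) draws per (policy, instance).
    """
    rng = random.Random(seed)
    return [
        [1 if rng.random() < keep_prob else 0 for _ in range(num_instances)]
        for _ in range(num_policies)
    ]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions over the same action set."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def diversity_bonus(action_probs):
    """Mean pairwise KL divergence across sub-policies (needs at least two).

    action_probs[k] is sub-policy k's distribution over the next node to visit.
    Assumption: a larger value means the sub-policies disagree more.
    """
    k = len(action_probs)
    pairs = [(a, b) for a in range(k) for b in range(k) if a != b]
    total = sum(kl_divergence(action_probs[a], action_probs[b]) for a, b in pairs)
    return total / len(pairs)


if __name__ == "__main__":
    masks = bootstrap_masks(num_instances=6, num_policies=3)
    for k, mask in enumerate(masks):
        print(f"sub-policy {k} trains on instances with mask {mask}")
    # Two sub-policies that prefer different next nodes yield a positive bonus.
    print(diversity_bonus([[0.9, 0.1], [0.1, 0.9]]))
```

In a training loop under these assumptions, each sub-policy's reinforcement learning loss would be computed only on its masked instances, and the diversity term would be added to the objective with a small weight so that sub-policies are rewarded for disagreeing.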
Original language: English
Title: Advances in Neural Information Processing Systems 36 (NeurIPS 2023)
Editors: A. Oh, T. Neumann, A. Globerson, K. Saenko, M. Hardt, S. Levine
Number of pages: 14
Status: Published - 2023
Event: 37th Conference on Neural Information Processing Systems, NeurIPS 2023 - New Orleans, United States
Duration: 10 Dec 2023 to 16 Dec 2023
Conference number: 37

Conference

Conference: 37th Conference on Neural Information Processing Systems, NeurIPS 2023
Short title: NeurIPS 2023
Country/Territory: United States
City: New Orleans
Period: 10/12/23 to 16/12/23
