The Dynamic Traveling Salesman Problem with Time-Dependent and Stochastic travel times: A Deep Reinforcement Learning Approach

Dawei Chen (Corresponding author), Christina Imdahl, David Lai, Tom van Woensel

Research output: Contribution to journal, Article, Academic, peer-reviewed


Abstract

We propose a novel deep reinforcement learning approach to the Dynamic Traveling Salesman Problem with Time-Dependent and Stochastic travel times (DTSP-TDS). The goal is to dynamically plan the route with the shortest tour duration that visits all customers while accounting for the uncertainty and time-dependence of travel times. The reinforcement learning approach handles the stochastic travel times dynamically by observing changing states and making decisions accordingly. It incorporates a Dynamic Graph Temporal Attention model with multi-head attention to extract information about the stochastic travel times. Numerical studies with varying numbers of customers and time intervals test its performance. The proposed approach outperforms the benchmarks, including rolling horizon heuristics and existing reinforcement learning approaches, in both solution quality and solving time. In addition, we demonstrate that the approach generalizes across DTSP-TDS instances in various scenarios.
Original language: English
Article number: 105022
Number of pages: 20
Journal: Transportation Research Part C: Emerging Technologies
Volume: 172
Early online date: 12 Feb 2025
Status: Published - Mar 2025

Funding

This work has made use of resources and expertise provided by the SURF Experimental Technologies Platform, which is part of the SURF cooperative in the Netherlands (No. EINF-9576). Dawei Chen acknowledges financial support from the China Scholarship Council (No. 202106370026).
