On-line Order Batching for Robot-based Order Picking Systems using Deep Reinforcement Learning

Research output: Contribution to conference › Abstract › Academic


Recent advances in robotics and automation have enabled e-commerce warehouses to adopt new ways of staying competitive under highly volatile customer demand and shorter deadlines. Uniquely, we consider an autonomous robot-based order picking system that fulfils orders from a multi-deep gravity flow rack in a dynamic environment in which orders arrive continuously. For such a system, we make two decisions: (i) when to pick orders and (ii) which orders compose a batch. We study the online order batching problem with the objective of minimizing weighted earliness and tardiness. Earliness results in increased inventory holding costs, deterioration of perishable goods, or opportunity costs, while tardiness is undesirable with regard to customer satisfaction. We formulate the problem as a semi-Markov decision process, which allows us to create a deep reinforcement learning (DRL) agent. The agent learns a policy by interacting with the environment and solves the problem using the Proximal Policy Optimization algorithm. We use several benchmark heuristics to evaluate the performance of the DRL agent. The agent learns a policy that produces feasible solutions superior to those of the benchmark heuristics in most of the tested cases. We demonstrate that the learning agent shows promising performance in a fluctuating order environment, which implies that it is effective and efficient, particularly in the online retailing of fast-moving consumer goods.
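As an illustration of the objective named in the abstract, the following minimal sketch computes a weighted earliness and tardiness cost for a set of completed orders. The abstract does not give the exact formulation, so the Order fields, the weights w_early and w_tardy, and the linear form of the cost are all assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Order:
    # Both fields are hypothetical names, not taken from the abstract.
    due_time: float         # deadline of the order
    completion_time: float  # time at which picking of the order finished


def weighted_earliness_tardiness(orders, w_early=1.0, w_tardy=1.0):
    """Sum of weighted earliness and tardiness over completed orders.

    Assumes a linear penalty; the weights are illustrative defaults.
    """
    total = 0.0
    for order in orders:
        # An order is either early, late, or exactly on time,
        # so at most one of the two terms is positive.
        earliness = max(0.0, order.due_time - order.completion_time)
        tardiness = max(0.0, order.completion_time - order.due_time)
        total += w_early * earliness + w_tardy * tardiness
    return total


orders = [
    Order(due_time=10.0, completion_time=7.0),   # finished 3 time units early
    Order(due_time=10.0, completion_time=14.0),  # finished 4 time units late
]
cost = weighted_earliness_tardiness(orders, w_early=0.5, w_tardy=2.0)
```

In a reinforcement learning setting, a cost of this kind would typically enter the reward with a negative sign, although how the authors shape the reward is not stated here.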
Original language: English
Publication status: Published - 5 Jul 2022
Event: EURO 2022 - Aalto University, Espoo, Finland
Duration: 3 Jul 2022 → 6 Jul 2022


Conference: EURO 2022


  • Warehousing
  • E-commerce
  • Deep Reinforcement Learning

