3D Oriented Human Bounding Box Estimation via Monocular Vision and Auto-labeling by Large Pre-trained Neural Models

Semih Orhan (Corresponding author), Elena Torta, Ömür Arslan

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, Academic, peer review

Abstract

To ensure safe and smooth human-robot interaction, autonomous robots operating around people often require 3D human body orientation and bounding box estimation. The orientation of a human is generally defined by the rotation around the yaw axis. However, human-robot interaction tasks may call for different orientation definitions depending on the application; for instance, an orientation can be defined using the shoulder and hip joints. A 3D skeleton map, consisting of multiple 3D body joint coordinates, enables defining an orientation in diverse ways. However, estimating 3D skeleton maps from monocular images is computationally expensive. Existing approaches using large neural network models are impractical for real-time robot operation due to limited onboard computation and power resources. In this paper, we automatically label the 3D human body orientation from 3D skeleton maps, and present a deep learning method to estimate 3D orientation and bounding box. We achieve this by leveraging a larger neural perception model to automatically generate an annotated training dataset, using a functional mapping from 3D skeleton joint coordinates to the defined orientation and bounding boxes. Experimental results demonstrate that our perception model estimates 3D human body orientation with an average error of 10.94° around the yaw axis and the 3D bounding box with errors of 19.15% in width, 8.71% in length, and 4.71% in height compared to the annotator’s output, while doubling the frame rate (FPS).
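
The abstract does not spell out the functional mapping from skeleton joints to orientation and bounding box, so the following is only a minimal sketch of one plausible form of it, not the authors' method. It assumes a right-handed frame with x forward, y left, and z up, and the function names (`yaw_from_skeleton`, `oriented_box`, `yaw_error_deg`) and the choice of shoulder and hip joints are hypothetical, chosen for illustration.

```python
import numpy as np


def yaw_from_skeleton(l_shoulder, r_shoulder, l_hip, r_hip):
    """Yaw (radians) of the torso's facing direction on the ground plane.

    Assumption: right-handed frame, x forward, y left, z up.
    """
    # Sum of the right-to-left shoulder and hip vectors: the body's "left" axis.
    lateral = (np.asarray(l_shoulder, float) - np.asarray(r_shoulder, float)
               + np.asarray(l_hip, float) - np.asarray(r_hip, float))
    lx, ly = lateral[0], lateral[1]
    # Forward is "left" rotated by -90 degrees about z: (lx, ly) -> (ly, -lx).
    return float(np.arctan2(-lx, ly))


def oriented_box(joints, yaw):
    """Center (world frame) and size (length, width, height) of a yaw-aligned box."""
    joints = np.asarray(joints, float)
    c, s = np.cos(yaw), np.sin(yaw)
    # Rows are the body axes: forward, left, up.
    axes = np.array([[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]])
    local = joints @ axes.T              # joint coordinates in the body frame
    lo, hi = local.min(axis=0), local.max(axis=0)
    size = hi - lo                       # extents along forward, left, up
    center = ((lo + hi) / 2.0) @ axes    # box center mapped back to world frame
    return center, size


def yaw_error_deg(yaw_a, yaw_b):
    """Absolute yaw difference in degrees, wrapped to [0, 180]."""
    d = (yaw_a - yaw_b + np.pi) % (2.0 * np.pi) - np.pi
    return float(abs(np.degrees(d)))


# Hypothetical skeleton of a person facing +x (values in meters).
joints = np.array([
    [0.0, 0.2, 1.4],    # left shoulder
    [0.0, -0.2, 1.4],   # right shoulder
    [0.0, 0.15, 0.9],   # left hip
    [0.0, -0.15, 0.9],  # right hip
    [0.0, 0.0, 1.7],    # head
    [0.1, 0.0, 0.0],    # foot
])
yaw = yaw_from_skeleton(joints[0], joints[1], joints[2], joints[3])
center, size = oriented_box(joints, yaw)
```

Labels produced this way by a large skeleton estimator could then serve as training targets for a smaller model, and `yaw_error_deg` shows why angular errors need wrap-around handling: yaws of 179° and -179° differ by 2°, not 358°.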
Original language: English
Title: Human-Friendly Robotics 2024
Subtitle: HFR: 17th International Workshop on Human-Friendly Robotics
Editors: Antonio Paolillo, Alessandro Giusti, Gabriele Abbate
Place of publication: Cham
Publisher: Springer
Pages: 182-196
Number of pages: 15
ISBN (electronic): 978-3-031-81688-8
ISBN (print): 978-3-031-81687-1, 978-3-031-81793-9
Status: Published - 26 Feb 2025
Event: 17th International Workshop on Human-Friendly Robotics, HFR 2024 - Lugano, Switzerland
Duration: 30 Sep 2024 to 1 Oct 2024

Publication series

Name: Springer Proceedings in Advanced Robotics (SPAR)
Volume: 35
ISSN (print): 2511-1256
ISSN (electronic): 2511-1264

Conference

Conference: 17th International Workshop on Human-Friendly Robotics, HFR 2024
Abbreviated title: HFR 2024
Country/Territory: Switzerland
City: Lugano
Period: 30/09/24 to 1/10/24
