Abstract
The estimation of object orientation from RGB images is a core component in many modern computer vision pipelines. Traditional techniques mostly predict a single orientation per image, learning a one-to-one mapping between images and rotations. However, when objects exhibit rotational symmetries, they can appear identical from multiple viewpoints. This induces ambiguity in the estimation problem, making images map to rotations in a one-to-many fashion. In this paper, we explore several ways of addressing this problem. In doing so, we specifically consider algorithms that can map an image to a set of multiple rotation estimates, accounting for symmetry-induced ambiguity. Our contributions are threefold. First, we create a data set with annotated symmetry information that covers symmetries induced through self-occlusion. Second, we compare and evaluate various learning strategies for multiple-hypothesis prediction models applied to orientation estimation. Finally, we propose to model orientation estimation as a binary classification problem. To this end, building on existing work from the field of shape reconstruction, we design a neural network that can be sampled to reconstruct the full range of ambiguous rotations for a given image. Quantitative evaluation on our annotated data set demonstrates its performance and motivates our design choices.
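The binary-classification view of orientation estimation can be illustrated with a minimal sketch: a classifier scores candidate rotations as consistent or inconsistent with an observation, and densely sampling candidates recovers the whole ambiguous set. The example below is purely illustrative and not the paper's network; it uses a hypothetical toy classifier for an object with n-fold rotational symmetry about the vertical axis, with `order`, `tol_deg`, and the yaw-only parameterization all being simplifying assumptions.

```python
import numpy as np

def valid_rotation_classifier(observed_yaw_deg, candidate_yaw_deg,
                              order=4, tol_deg=5.0):
    """Toy stand-in for a learned binary classifier: returns True when the
    candidate yaw is indistinguishable from the observed one for an object
    with `order`-fold rotational symmetry about the vertical axis.
    (Hypothetical; the paper's model operates on image features.)"""
    period = 360.0 / order
    diff = (candidate_yaw_deg - observed_yaw_deg) % period
    return min(diff, period - diff) <= tol_deg

def sample_ambiguous_rotations(observed_yaw_deg, order=4, n_samples=360):
    """Densely sample candidate yaws and keep those the classifier accepts,
    reconstructing the full range of ambiguous rotations."""
    candidates = np.linspace(0.0, 360.0, n_samples, endpoint=False)
    return [c for c in candidates
            if valid_rotation_classifier(observed_yaw_deg, c, order)]

accepted = sample_ambiguous_rotations(30.0, order=4)
```

For a 4-fold symmetric object observed at 30 degrees, the sampler accepts yaws near 30, 120, 210, and 300 degrees, i.e. the one-to-many set of rotations the abstract describes.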
| Field | Value |
|---|---|
| Original language | English |
| Article number | 40 |
| Number of pages | 17 |
| Journal | Machine Vision and Applications |
| Volume | 36 |
| Issue number | 2 |
| DOIs | |
| Publication status | Published - Mar 2025 |
Bibliographical note
Publisher Copyright: © The Author(s) 2025.
Keywords
- Ambiguity
- Computer vision
- Machine learning
- Orientation estimation
- Pose estimation
- Symmetry
- Uncertainty