In this work, we present an end-to-end network for fast panoptic segmentation. This network, called Fast Panoptic Segmentation Network (FPSNet), requires neither computationally costly instance mask predictions nor rule-based merging operations. It achieves this by casting the panoptic task as a custom dense pixel-wise classification task that assigns either a class label or an instance ID to each pixel. We evaluate FPSNet on the Cityscapes and Pascal VOC datasets and find that it is faster than existing panoptic segmentation methods while achieving better or similar panoptic segmentation performance. On the Cityscapes validation set, we achieve a Panoptic Quality score of 55.1% at a prediction time of 114 milliseconds for images with a resolution of 1024 × 2048 pixels. For lower resolutions of the Cityscapes dataset and for the Pascal VOC dataset, FPSNet achieves prediction times as low as 45 and 28 milliseconds, respectively.
- Semantic scene understanding
- deep learning in robotics and automation
- object detection, segmentation and categorization
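
To illustrate what a dense pixel-wise formulation of the panoptic task can look like, the following minimal sketch decodes a per-pixel score volume into a panoptic output with a single argmax, without instance masks or a merging step. It is not the FPSNet implementation: the channel layout (stuff channels first, then instance slots), the function name `decode_panoptic`, the `slot_classes` list, and the class IDs 11 and 13 are all assumptions made for this example.

```python
from typing import List, Tuple

# One output pixel: (semantic class, instance ID). Instance ID 0 means "no instance".
PanopticPixel = Tuple[int, int]


def decode_panoptic(
    scores: List[List[List[float]]],
    num_stuff: int,
    slot_classes: List[int],
) -> List[List[PanopticPixel]]:
    """Decode per-pixel channel scores into (class, instance ID) pairs.

    Assumed layout (hypothetical, for illustration only):
      scores[c][y][x]  score of output channel c at pixel (y, x)
      channels [0, num_stuff) are stuff classes
      channels [num_stuff, ...) are instance slots
      slot_classes[k]  thing class of instance slot k, e.g. from a detector
    """
    num_channels = len(scores)
    height, width = len(scores[0]), len(scores[0][0])
    output = []
    for y in range(height):
        row = []
        for x in range(width):
            # A single argmax over all channels decides the pixel's label.
            best = max(range(num_channels), key=lambda c: scores[c][y][x])
            if best < num_stuff:
                row.append((best, 0))  # stuff pixel: class label only
            else:
                slot = best - num_stuff
                row.append((slot_classes[slot], slot + 1))  # thing pixel
        output.append(row)
    return output


# Toy 1 x 3 image with two stuff channels and two instance slots.
example_scores = [
    [[0.90, 0.10, 0.10]],  # stuff class 0
    [[0.05, 0.20, 0.10]],  # stuff class 1
    [[0.03, 0.60, 0.10]],  # instance slot 0
    [[0.02, 0.10, 0.70]],  # instance slot 1
]
result = decode_panoptic(example_scores, num_stuff=2, slot_classes=[11, 13])
print(result)
```

Because every pixel receives exactly one label from one argmax, overlaps between instances or between things and stuff cannot occur in this sketch, which is why no rule-based merging is needed.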