On boosting semantic street scene segmentation with weak supervision

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Academic › peer-review

Abstract

Training convolutional networks for semantic segmentation requires per-pixel ground-truth labels, which are time-consuming and hence costly to obtain. Therefore, in this work, we develop a hierarchical deep network architecture and a corresponding loss for semantic segmentation that can be trained from weak supervision, such as bounding boxes or image-level labels, as well as from strong per-pixel supervision. We demonstrate that the hierarchical structure and simultaneous training on strong (per-pixel) and weak (bounding-box) labels, even from separate datasets, consistently improves performance over per-pixel-only training. Moreover, we explore the more challenging case of adding weak image-level labels. We collect street-scene images and weak labels from the large-scale Open Images dataset to generate the OpenScapes dataset, and we use this novel dataset to increase segmentation performance on two established per-pixel-labeled datasets, Cityscapes and Vistas. We report performance gains of up to +13.2% mIoU on crucial street-scene classes, and an inference speed of 20 fps on a Titan V GPU for Cityscapes at 512 x 1024 resolution. Our network and the OpenScapes dataset are shared with the research community.
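The abstract describes jointly training on strong (per-pixel) and weak (bounding-box) supervision. The paper's actual loss is hierarchical and more involved; the following is only a minimal, generic sketch of the idea of combining the two supervision signals, with hypothetical function names and a toy weak term (per-pixel cross-entropy on labeled pixels, plus a term that pushes the best-responding pixel inside each labeled box toward the box's class):

```python
import numpy as np

def softmax(logits, axis=-1):
    z = logits - logits.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def strong_loss(logits, pixel_labels):
    """Per-pixel cross-entropy on pixels that have ground truth (label -1 = unlabeled)."""
    probs = softmax(logits)                       # (H, W, C)
    mask = pixel_labels >= 0
    if not mask.any():
        return 0.0
    p = probs[mask, pixel_labels[mask]]           # probability of the true class per labeled pixel
    return float(-np.log(p + 1e-12).mean())

def weak_box_loss(logits, boxes):
    """For each (y0, y1, x0, x1, cls) box, push the strongest pixel inside it toward cls.

    This max-inside-box term is a common weak-supervision heuristic, used here
    purely for illustration; it is not the loss from the paper.
    """
    probs = softmax(logits)
    losses = []
    for y0, y1, x0, x1, cls in boxes:
        region = probs[y0:y1, x0:x1, cls]
        losses.append(-np.log(region.max() + 1e-12))
    return float(np.mean(losses)) if losses else 0.0

def combined_loss(logits, pixel_labels, boxes, w_weak=0.5):
    """Strong + weighted weak loss; strong and weak samples may come from different datasets."""
    return strong_loss(logits, pixel_labels) + w_weak * weak_box_loss(logits, boxes)
```

In a real training loop the strong and weak terms would typically be computed on batches drawn from the respective datasets (e.g., Cityscapes for per-pixel labels, OpenScapes for boxes) and backpropagated jointly.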
Original language: English
Title of host publication: 2019 IEEE Intelligent Vehicles Symposium, IV 2019
Place of publication: Piscataway
Publisher: Institute of Electrical and Electronics Engineers
Pages: 1334-1339
Number of pages: 6
ISBN (electronic): 978-1-7281-0560-4
DOI: 10.1109/IVS.2019.8814217
Publication status: Published - 2019
Event: 2019 IEEE Intelligent Vehicles Symposium (IV) - Paris, France
Duration: 9 Jun 2019 - 12 Jun 2019

Conference

Conference: 2019 IEEE Intelligent Vehicles Symposium (IV)
Country: France
City: Paris
Period: 9/06/19 - 12/06/19


Cite this

Meletis, P., & Dubbelman, G. (2019). On boosting semantic street scene segmentation with weak supervision. In 2019 IEEE Intelligent Vehicles Symposium, IV 2019 (pp. 1334-1339). Piscataway: Institute of Electrical and Electronics Engineers. https://doi.org/10.1109/IVS.2019.8814217
@inproceedings{a54f3fd499844627a8e3120941c6ef8e,
title = "On boosting semantic street scene segmentation with weak supervision",
author = "Panagiotis Meletis and Gijs Dubbelman",
year = "2019",
doi = "10.1109/IVS.2019.8814217",
language = "English",
pages = "1334--1339",
booktitle = "2019 IEEE Intelligent Vehicles Symposium, IV 2019",
publisher = "Institute of Electrical and Electronics Engineers",
address = "United States",

}

