Data selection for training semantic segmentation CNNs with cross-dataset weak supervision

Panagiotis Meletis, R.R.F.M. Romijnders, Gijs Dubbelman

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Academic › peer-review


Training convolutional networks for semantic segmentation with strong (per-pixel) and weak (per-bounding-box) supervision requires a large amount of weakly labeled data. We propose two methods for selecting the most relevant data with weak supervision. The first method is designed for finding visually similar images without the need for labels and is based on modeling image representations with a Gaussian Mixture Model (GMM). As a byproduct of GMM modeling, we present useful insights into characterizing the data generating distribution. The second method aims at finding images with high object diversity and requires only the bounding box labels. Both methods are developed in the context of automated driving, and experimentation is conducted on the Cityscapes and Open Images datasets. We demonstrate performance gains by reducing the number of weakly labeled images used by up to 100 times for Open Images and by up to 20 times for Cityscapes.
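The two selection ideas in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that image representations are already available as plain feature vectors and that a diagonal-covariance GMM has already been fitted elsewhere (its weights, means and variances are passed in). It also uses the Shannon entropy of bounding-box class labels as a stand-in for "object diversity", since the abstract does not define the measure. All function names are hypothetical.

```python
import math
from collections import Counter


def gmm_log_likelihood(x, weights, means, variances):
    """Log-density of feature vector x under a diagonal-covariance GMM.

    The GMM parameters are assumed to be fitted beforehand.
    """
    log_terms = []
    for w, mu, var in zip(weights, means, variances):
        log_p = math.log(w)
        for xi, mi, vi in zip(x, mu, var):
            log_p += -0.5 * (math.log(2 * math.pi * vi) + (xi - mi) ** 2 / vi)
        log_terms.append(log_p)
    # Log-sum-exp over components for numerical stability.
    m = max(log_terms)
    return m + math.log(sum(math.exp(t - m) for t in log_terms))


def select_similar(candidates, gmm, k):
    """Return the k image ids whose features are most likely under the GMM.

    candidates: dict mapping image id -> feature vector (no labels needed).
    gmm: tuple (weights, means, variances).
    """
    ranked = sorted(
        candidates,
        key=lambda image_id: gmm_log_likelihood(candidates[image_id], *gmm),
        reverse=True,
    )
    return ranked[:k]


def diversity_score(box_labels):
    """Shannon entropy of the class labels of one image's bounding boxes.

    Assumption: entropy is used here as one possible diversity measure.
    """
    counts = Counter(box_labels)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in counts.values())


def select_diverse(images, k):
    """Return the k image ids with the highest label diversity.

    images: dict mapping image id -> list of bounding-box class labels.
    """
    ranked = sorted(
        images, key=lambda image_id: diversity_score(images[image_id]), reverse=True
    )
    return ranked[:k]
```

In this sketch, `select_similar` needs only features, whereas `select_diverse` needs only the bounding-box class labels, mirroring the label requirements stated for the two methods.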
Original language: English
Title of host publication: 2019 IEEE Intelligent Transportation Systems Conference (ITSC)
Place of Publication: Piscataway
Publisher: Institute of Electrical and Electronics Engineers
Number of pages: 7
ISBN (Electronic): 978-1-5386-7024-8
Publication status: Published - Oct 2019
Event: IEEE Intelligent Transportation Systems Conference (ITSC2019) - Auckland, New Zealand
Duration: 27 Oct 2019 to 30 Oct 2019


Conference: IEEE Intelligent Transportation Systems Conference (ITSC2019)
Abbreviated title: ITSC2019
Country: New Zealand
