Abstract
Automatic understanding of natural scenes and the annotation of regions with semantically meaningful labels, such as road or sky, are key aspects of image and video analysis. Region annotation is considered helpful for improving object-of-interest detection because the position of the object in the scene is also exploited. To obtain a reliable model of a scene and its associated context information, the labeling task involves image analysis at multiple scene levels, both global and local. In this paper, we develop a general framework for automatic semantic labeling of video scenes by combining local features and spatial contextual cues. While maintaining high accuracy, we pursue an algorithm with low computational complexity, so that it is suitable for real-time implementation in embedded video surveillance. We apply our approach to a complex surveillance use case and to three different datasets: WaterVisie [1], LabelMe [2] and our own dataset. We show that our method quantitatively and qualitatively outperforms two state-of-the-art approaches [3][4].
Original language | English |
---|---|
Title of host publication | Proceedings 2014 International Conference on Information Science, Electronics and Electrical Engineering (ISEEE), 26-28 April 2014, Sapporo, Japan |
Place of Publication | Piscataway |
Publisher | Institute of Electrical and Electronics Engineers |
Volume | 3 |
ISBN (Print) | 978-1-4799-3196-5 |
Publication status | Published - 2014 |
Event | conference; ISEEE; 2014-04-26; 2014-04-28 - Duration: 26 Apr 2014 → 28 Apr 2014 |
Conference
Conference | conference; ISEEE; 2014-04-26; 2014-04-28 |
---|---|
Period | 26/04/14 → 28/04/14 |
Other | ISEEE |