Interest in the analysis of first-person videos has grown in the last few years with the spread of low-cost wearable devices. Nevertheless, understanding the environment surrounding the wearer remains a difficult task that involves many elements. This work presents a method for detecting and mapping the presence of people and crowds around the wearer. Features extracted at the crowd level are used to build a robust representation that can handle the variations and occlusions in people’s visual characteristics inside a crowd. To this end, convolutional neural networks are exploited. Results demonstrate that this approach achieves high accuracy in recognizing crowds, and that it allows a general interpretation of the context through the classification of characteristics of the segmented background.
|Title of host publication||Advances in Computational Intelligence: 13th International Work-Conference on Artificial Neural Networks, IWANN 2015, Palma de Mallorca, Spain, June 10-12, 2015. Proceedings, Part I|
|Editors||I. Rojas, G. Joya|
|Place of Publication||Berlin|
|Publication status||Published - 2015|
|Name||Lecture Notes in Computer Science|