TY - JOUR
T1 - A hierarchical approach for associating body-worn sensors to video regions in crowded mingling scenarios
AU - Cabrera-Quiros, Laura
AU - Hung, Hayley
PY - 2019/7/1
Y1 - 2019/7/1
N2 - We address the complex problem of associating several wearable devices with the spatio-temporal region of their wearers in video during crowded mingling events using only acceleration and proximity. This is a particularly important first step for multisensor behavior analysis using video and wearable technologies, where the privacy of the participants must be maintained. Most state-of-the-art works using these two modalities perform their association manually, which becomes infeasible in practice as the number of people in the scene increases. We propose an automatic association method based on a hierarchical linear assignment optimization, which exploits the spatial context of the scene. Moreover, we present extensive experiments on matching from 2 to more than 69 acceleration and video streams, showing significant improvements over a random baseline in a real-world crowded mingling scenario. We also show the effectiveness of our method for incomplete or missing streams (up to a certain limit) and analyze the tradeoff between the length of the streams and the number of participants. Finally, we provide an analysis of failure cases, showing that a deep understanding of the social actions within the context of the event is necessary to further improve performance on this intriguing task.
AB - We address the complex problem of associating several wearable devices with the spatio-temporal region of their wearers in video during crowded mingling events using only acceleration and proximity. This is a particularly important first step for multisensor behavior analysis using video and wearable technologies, where the privacy of the participants must be maintained. Most state-of-the-art works using these two modalities perform their association manually, which becomes infeasible in practice as the number of people in the scene increases. We propose an automatic association method based on a hierarchical linear assignment optimization, which exploits the spatial context of the scene. Moreover, we present extensive experiments on matching from 2 to more than 69 acceleration and video streams, showing significant improvements over a random baseline in a real-world crowded mingling scenario. We also show the effectiveness of our method for incomplete or missing streams (up to a certain limit) and analyze the tradeoff between the length of the streams and the number of participants. Finally, we provide an analysis of failure cases, showing that a deep understanding of the social actions within the context of the event is necessary to further improve performance on this intriguing task.
KW - acceleration
KW - association
KW - computer vision
KW - mingling
KW - wearable sensor
UR - http://www.scopus.com/inward/record.url?scp=85058887309&partnerID=8YFLogxK
U2 - 10.1109/TMM.2018.2888798
DO - 10.1109/TMM.2018.2888798
M3 - Article
AN - SCOPUS:85058887309
SN - 1520-9210
VL - 21
SP - 1867
EP - 1879
JO - IEEE Transactions on Multimedia
JF - IEEE Transactions on Multimedia
IS - 7
M1 - 8584113
ER -