TY - JOUR
T1 - Real-time small-object change detection from ground vehicles using a siamese convolutional neural network
AU - Klomp, Sander R.
AU - van de Wouw, Dennis W.J.M.
AU - de With, Peter H.N.
PY - 2019/11
Y1 - 2019/11
N2 - Detecting changes in an uncontrolled environment using cameras mounted on a ground vehicle is critical for the detection of roadside Improvised Explosive Devices (IEDs). Hidden IEDs are often accompanied by visible markers, whose appearances are a priori unknown. Little work has been published on detecting unknown objects using deep learning. This article shows the feasibility of applying convolutional neural networks (CNNs) to predict the location of markers in real time by comparison with an earlier reference recording. The authors investigate novel encoder–decoder Siamese CNN architectures and introduce a modified double-margin contrastive loss function to achieve pixel-level change detection results. Their dataset consists of seven pairs of challenging real-world recordings, and they investigate augmentation with artificial object data. The proposed network architecture can compare two images of 1920 × 1440 pixels in 27 ms on an RTX Titan GPU and significantly outperforms state-of-the-art networks and algorithms on their dataset, improving the F-1 score by 0.28.
AB - Detecting changes in an uncontrolled environment using cameras mounted on a ground vehicle is critical for the detection of roadside Improvised Explosive Devices (IEDs). Hidden IEDs are often accompanied by visible markers, whose appearances are a priori unknown. Little work has been published on detecting unknown objects using deep learning. This article shows the feasibility of applying convolutional neural networks (CNNs) to predict the location of markers in real time by comparison with an earlier reference recording. The authors investigate novel encoder–decoder Siamese CNN architectures and introduce a modified double-margin contrastive loss function to achieve pixel-level change detection results. Their dataset consists of seven pairs of challenging real-world recordings, and they investigate augmentation with artificial object data. The proposed network architecture can compare two images of 1920 × 1440 pixels in 27 ms on an RTX Titan GPU and significantly outperforms state-of-the-art networks and algorithms on their dataset, improving the F-1 score by 0.28.
UR - http://www.scopus.com/inward/record.url?scp=85078922292&partnerID=8YFLogxK
U2 - 10.2352/J.ImagingSci.Technol.2019.63.6.060402
DO - 10.2352/J.ImagingSci.Technol.2019.63.6.060402
M3 - Article
AN - SCOPUS:85078922292
VL - 63
JO - Journal of Imaging Science and Technology
JF - Journal of Imaging Science and Technology
SN - 1062-3701
IS - 6
M1 - 060402
ER -