Abstract
Following the trend of ultrasound-based gesture recognition, this study introduces the concept of time-sequence classification of ultrasonic patterns induced by hand movements on a microphone array. We refer to time-sequence ultrasound echoes as continuous frequency patterns received in real time at different steering angles. The ultrasound source is a single tone continuously emitted from the center of the microphone array. In the interim, the array beamforms and locates ultrasonic activity (induced echoes), after which a processing pipeline is initiated to extract band-limited frequency features. These beamformed features are organized in a 2D matrix of size 11 × 30, updated every 10 ms, on which a Temporal Convolutional Network (TCN) outputs a continuous classification. Prior to that, the same TCN is trained to classify the Doppler shift variability rate. Using this approach, we show that a user can easily perform 49 gestures at different steering angles by means of sequence detection. To keep it simple for users, we define two Doppler shift variability rates, very slow and very fast, which the TCN detects 95-99% of the time. Not only can a gesture be performed in different directions, but the length of each performed gesture can also be measured. This leverages the diversity of in-air ultrasonic gestures, allowing more control capabilities. The process is designed under low-resource settings; that is, given that this real-time process is always on, power and memory resources should be optimized. The proposed solution needs 6.2-10.2 MMACs and a memory footprint of 6 KB, allowing such a gesture recognition system to be hosted by energy-constrained edge devices such as smart speakers.
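The abstract does not include implementation details, so the following is a minimal, hypothetical sketch (not the authors' code) of a causal TCN producing per-frame class scores from streaming 11 × 30 feature matrices. The assumed layout (11 steering angles × 30 frequency bins), the layer widths, the dilation schedule, and the three-class head (no activity / very slow / very fast Doppler variability) are all illustrative assumptions.

```python
# Hypothetical sketch: a small causal TCN over streaming 11x30 feature frames,
# assuming 11 steering angles x 30 band-limited frequency bins per 10 ms frame.
# Widths, depth, and the 3-class head are illustrative, not the paper's model.
import torch
import torch.nn as nn


class CausalConv1d(nn.Module):
    """1D convolution padded on the left only, so outputs never see future frames."""
    def __init__(self, in_ch, out_ch, kernel_size, dilation):
        super().__init__()
        self.pad = (kernel_size - 1) * dilation
        self.conv = nn.Conv1d(in_ch, out_ch, kernel_size, dilation=dilation)

    def forward(self, x):                          # x: (batch, channels, time)
        x = nn.functional.pad(x, (self.pad, 0))    # left-pad along time
        return self.conv(x)


class TinyTCN(nn.Module):
    def __init__(self, in_features=11 * 30, hidden=32, num_classes=3):
        super().__init__()
        layers, ch = [], in_features
        for dilation in (1, 2, 4, 8):              # receptive field grows exponentially
            layers += [CausalConv1d(ch, hidden, kernel_size=3, dilation=dilation),
                       nn.ReLU()]
            ch = hidden
        self.tcn = nn.Sequential(*layers)
        self.head = nn.Conv1d(hidden, num_classes, kernel_size=1)

    def forward(self, frames):
        # frames: (batch, time, 11, 30) -> flatten each frame into a feature vector
        b, t, a, f = frames.shape
        x = frames.reshape(b, t, a * f).transpose(1, 2)   # (batch, features, time)
        return self.head(self.tcn(x))                     # per-frame class logits


if __name__ == "__main__":
    model = TinyTCN()
    stream = torch.randn(1, 100, 11, 30)   # 100 frames = 1 s of 10 ms updates
    logits = model(stream)                 # (1, 3, 100): continuous classification
    print(logits.shape)
```

Left-only padding keeps the convolutions causal, so the classification emitted at each 10 ms frame depends only on past frames, which fits the always-on, streaming setting described in the abstract.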
Original language | English |
---|---|
Title of host publication | Proceedings of the 2020 Design, Automation and Test in Europe Conference and Exhibition, DATE 2020 |
Editors | Giorgio Di Natale, Cristiana Bolchini, Elena-Ioana Vatajelu |
Place of Publication | Piscataway |
Publisher | Institute of Electrical and Electronics Engineers |
Pages | 1259-1264 |
Number of pages | 6 |
ISBN (Electronic) | 978-3-9819263-4-7 |
DOIs | |
Publication status | Published - Mar 2020 |
Event | 23rd Design, Automation and Test in Europe Conference and Exhibition (DATE 2020), Grenoble, France, 9 Mar 2020 → 13 Mar 2020 (conference number: 23) |
Conference
Conference | 23rd Design, Automation and Test in Europe Conference and Exhibition (DATE 2020) |
---|---|
Abbreviated title | DATE 2020 |
Country/Territory | France |
City | Grenoble |
Period | 9/03/20 → 13/03/20 |
Keywords
- Doppler shift
- Edge Devices
- Gesture Recognition
- Human System Interaction (HSI)
- Temporal Convolutional Networks (TCN)