TY - GEN
T1 - QMTS
T2 - 32nd International Conference on Artificial Neural Networks, ICANN 2023
AU - Eissa, Sherif
AU - Corradi, Federico
AU - de Putter, Floran
AU - Stuijk, Sander
AU - Corporaal, Henk
PY - 2023/9/22
Y1 - 2023/9/22
N2 - Spiking Neural Networks (SNNs) represent a promising solution for streaming applications at the edge that have strict performance and energy requirements. However, implementing SNNs efficiently at the edge requires model quantization to reduce memory and compute requirements. In this paper, we provide methods to quantize a prominent neuron model for temporally rich problems, the parameterized Adaptive Leaky-Integrate-and-Fire (p-ALIF). p-ALIF neurons combine the computational simplicity of Integrate-and-Fire neurons with accurate learning at multiple timescales, activation sparsity, and increased dynamic range, due to adaptation and heterogeneity. p-ALIF neurons have shown state-of-the-art (SoTA) performance on temporal tasks such as speech recognition and health monitoring. Our method, QMTS, separates SNN quantization into two stages, allowing one to explore different quantization levels efficiently. QMTS search heuristics are tailored for leaky heterogeneous neurons. We demonstrate QMTS on several temporal benchmarks, showing up to 40x memory reduction and 4x sparser synaptic operations with little accuracy loss, compared to 32-bit float.
AB - Spiking Neural Networks (SNNs) represent a promising solution for streaming applications at the edge that have strict performance and energy requirements. However, implementing SNNs efficiently at the edge requires model quantization to reduce memory and compute requirements. In this paper, we provide methods to quantize a prominent neuron model for temporally rich problems, the parameterized Adaptive Leaky-Integrate-and-Fire (p-ALIF). p-ALIF neurons combine the computational simplicity of Integrate-and-Fire neurons with accurate learning at multiple timescales, activation sparsity, and increased dynamic range, due to adaptation and heterogeneity. p-ALIF neurons have shown state-of-the-art (SoTA) performance on temporal tasks such as speech recognition and health monitoring. Our method, QMTS, separates SNN quantization into two stages, allowing one to explore different quantization levels efficiently. QMTS search heuristics are tailored for leaky heterogeneous neurons. We demonstrate QMTS on several temporal benchmarks, showing up to 40x memory reduction and 4x sparser synaptic operations with little accuracy loss, compared to 32-bit float.
KW - neuromorphic computing
KW - quantization
KW - spiking neural networks
UR - http://www.scopus.com/inward/record.url?scp=85174606351&partnerID=8YFLogxK
U2 - 10.1007/978-3-031-44207-0_34
DO - 10.1007/978-3-031-44207-0_34
M3 - Conference contribution
AN - SCOPUS:85174606351
SN - 978-3-031-44206-3
T3 - Lecture Notes in Computer Science (LNCS)
SP - 407
EP - 419
BT - Artificial Neural Networks and Machine Learning – ICANN 2023
A2 - Iliadis, Lazaros
A2 - Papaleonidas, Antonios
A2 - Angelov, Plamen
A2 - Jayne, Chrisina
PB - Springer
CY - Cham
Y2 - 26 September 2023 through 29 September 2023
ER -