Bitwise neural network acceleration: opportunities and challenges

Michel van Lier, Luc Waeijen, Henk Corporaal

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Academic › peer-review

Abstract

Real-time inference of deep convolutional neural networks (CNNs) on embedded systems and SoCs would enable many interesting applications. However, these CNNs are computation- and data-expensive, making it difficult to execute them in real time on energy-constrained embedded platforms. Recent research has shown that light-weight CNNs with quantized model weights and activations constrained to a single bit {-1, +1} can still achieve reasonable accuracy compared to the non-quantized 32-bit model. These binary neural networks (BNNs) theoretically allow a drastic reduction of the required energy and run-time by shrinking the memory size, reducing the number of memory accesses, and replacing expensive two's complement arithmetic operations with more efficient bitwise versions. To exploit these advantages, we propose a bitwise CNN accelerator (BNNA) mapped on an FPGA. We implement the Hubara'16 network [1] on the Xilinx Zynq-7020 SoC. Massive parallelism is achieved by performing 4608 binary MACs in parallel, which enables real-time speeds of up to 110 fps while using only 22% of the FPGA LUTs. In comparison to a 32-bit network, a speedup of 32 times and a resource reduction of 40 times are achieved, with memory bandwidth being the main bottleneck. The provided detailed analysis of the carefully crafted accelerator design exposes the challenges and opportunities in bitwise neural network accelerator design.
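The abstract attributes the BNN efficiency gains to replacing two's-complement multiply-accumulates with bitwise operations. The sketch below (in Python, purely illustrative — it is not the paper's FPGA implementation, and the function names `pack_bits` and `binary_dot` are assumptions) shows the standard trick: weights and activations in {-1, +1} are packed into a bit vector (bit 1 encodes +1, bit 0 encodes -1), so an n-element dot product collapses into one XOR plus a popcount.

```python
def pack_bits(values):
    """Pack a list of {-1, +1} values into an integer bit vector (1 -> +1, 0 -> -1)."""
    word = 0
    for i, v in enumerate(values):
        if v == +1:
            word |= 1 << i
    return word

def binary_dot(a_bits, b_bits, n):
    """Dot product of two n-element {-1, +1} vectors given as packed bit vectors.

    Matching bits contribute +1 and differing bits contribute -1, so
    dot = n - 2 * popcount(a XOR b); no signed multiplies are needed.
    """
    diff = (a_bits ^ b_bits) & ((1 << n) - 1)
    return n - 2 * bin(diff).count("1")

# Sanity check against the naive signed multiply-accumulate:
a = [+1, -1, +1, +1, -1, -1, +1, -1]
b = [+1, +1, -1, +1, -1, +1, +1, +1]
assert binary_dot(pack_bits(a), pack_bits(b), len(a)) == sum(x * y for x, y in zip(a, b))
```

In hardware this maps naturally onto LUTs: the XOR is a single gate per bit and the popcount a small adder tree, which is how thousands of such binary MACs can run in parallel on a modest FPGA.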

Original language: English
Title of host publication: 2019 8th Mediterranean Conference on Embedded Computing, MECO 2019 - Proceedings
Editors: Radovan Stojanovic, Lech Jozwiak, Budimir Lutovac, Drazen Jurisic
Place of publication: Piscataway
Publisher: Institute of Electrical and Electronics Engineers
Number of pages: 5
ISBN (Electronic): 978-1-7281-1740-9
DOI: 10.1109/MECO.2019.8760178
Publication status: Published - 1 Jun 2019
Event: 8th Mediterranean Conference on Embedded Computing, MECO 2019 - Budva, Montenegro
Duration: 10 Jun 2019 - 14 Jun 2019

Conference

Conference: 8th Mediterranean Conference on Embedded Computing, MECO 2019
Country: Montenegro
City: Budva
Period: 10/06/19 - 14/06/19

Cite this

van Lier, M., Waeijen, L., & Corporaal, H. (2019). Bitwise neural network acceleration: opportunities and challenges. In R. Stojanovic, L. Jozwiak, B. Lutovac, & D. Jurisic (Eds.), 2019 8th Mediterranean Conference on Embedded Computing, MECO 2019 - Proceedings [8760178]. Piscataway: Institute of Electrical and Electronics Engineers. https://doi.org/10.1109/MECO.2019.8760178