A data-reuse aware accelerator for large-scale convolutional networks

M.C.J. Peemen, B. Mesman, H. Corporaal

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > Academic



This paper presents a clustered SIMD accelerator template for convolutional networks. These networks significantly outperform other methods in detection and classification tasks in the vision domain. Because of their high compute and data transfer requirements, these applications benefit substantially from a dedicated accelerator. The proposed accelerator reduces memory traffic by applying loop transformations such as tiling and fusion, the latter merging successive layers. Although fusion can introduce redundant computations, it often reduces data transfer and can therefore remove performance bottlenecks. The SIMD cluster is mapped to a Xilinx Zynq FPGA, which can achieve 6.4 Gops of performance with a small amount of resources. Performance can be scaled by using multiple clusters.
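The trade-off between redundant computation and data transfer can be illustrated with a minimal sketch. It is not the accelerator's actual schedule: it assumes two 1D convolution layers, a hypothetical cost model that counts one external transfer per value read or written, and illustrative function names (`conv1d`, `unfused`, `fused_tiled`) that do not come from the paper.

```python
def conv1d(x, w):
    """Valid 1D convolution (correlation form) of list x with kernel w."""
    k = len(w)
    return [sum(x[i + j] * w[j] for j in range(k)) for i in range(len(x) - k + 1)]


def unfused(x, w1, w2):
    """Run two layers back to back; the intermediate layer goes through external memory."""
    mid = conv1d(x, w1)
    out = conv1d(mid, w2)
    macs = len(mid) * len(w1) + len(out) * len(w2)
    # Assumed cost model: read x, write mid, read mid back, write out.
    transfers = len(x) + 2 * len(mid) + len(out)
    return out, macs, transfers


def fused_tiled(x, w1, w2, tile):
    """Fuse both layers per output tile; the intermediate tile stays in a local buffer."""
    k1, k2 = len(w1), len(w2)
    n_out = len(x) - k1 - k2 + 2
    out, macs, transfers = [], 0, 0
    for start in range(0, n_out, tile):
        t = min(tile, n_out - start)
        # Input window needed to produce t outputs through both layers.
        window = x[start:start + t + k1 + k2 - 2]
        mid = conv1d(window, w1)  # kept locally, never written to external memory
        res = conv1d(mid, w2)
        out.extend(res)
        # Intermediate values at tile borders are recomputed by neighbouring tiles.
        macs += len(mid) * k1 + len(res) * k2
        transfers += len(window) + len(res)
    return out, macs, transfers


if __name__ == "__main__":
    x = list(range(64))
    w1, w2 = [1, 2, 1], [1, 0, -1]
    print("unfused (MACs, transfers):", unfused(x, w1, w2)[1:])
    print("fused   (MACs, transfers):", fused_tiled(x, w1, w2, tile=8)[1:])
```

In this toy setup (64 inputs, two 3-tap kernels, tiles of 8 outputs) the fused version performs 408 multiply-accumulates instead of 366, but moves 152 values instead of 248. These figures only characterize the sketch, not the accelerator described in the paper.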
Original language: English
Title: Workshop on Neuromorphic Architectures (NeuroArch), 14 June 2014, Minneapolis, Minnesota
Status: Published - 2014

