TY - GEN
T1 - System Simulation of Memristor Based Computation in Memory Platforms
AU - BanaGozar, Ali
AU - Vadivel, Kanishkan
AU - Multanen, Joonas
AU - Jääskeläinen, Pekka
AU - Stuijk, Sander
AU - Corporaal, Henk
PY - 2020
Y1 - 2020
N2 - Processors based on the von Neumann architecture perform inefficiently on many emerging data-intensive workloads. Computation in-memory (CIM) addresses this challenge by performing the computation at the data location. Memristors deployed in a crossbar structure are a promising candidate for realizing CIM. Although extensive research has been carried out on memristors at the device and circuit level, the implications of integrating them as accelerators (CIM units) in a full-blown system have not been studied extensively. To study this, we developed a simulator for the memristor crossbar and its analog periphery. This paper evaluates a complete system consisting of a Transport Triggered Architecture (TTA) based host core integrating one or more CIM units, using cycle-accurate simulation. The simulator a) includes the memristor crossbar operations as well as the surrounding analog drivers, b) provides the required interface to the co-processing digital elements, and c) presents a micro-instruction set architecture (micro-ISA) that controls and operates both analog and digital components. It is used to assess the effectiveness of the CIM unit in terms of performance, energy, and area in a full-blown system. For example, the energy-delay-area product (EDAP) of the deep learning application LeNet is reduced by 84% in a full-blown system deploying memristor-based crossbars.
AB - Processors based on the von Neumann architecture perform inefficiently on many emerging data-intensive workloads. Computation in-memory (CIM) addresses this challenge by performing the computation at the data location. Memristors deployed in a crossbar structure are a promising candidate for realizing CIM. Although extensive research has been carried out on memristors at the device and circuit level, the implications of integrating them as accelerators (CIM units) in a full-blown system have not been studied extensively. To study this, we developed a simulator for the memristor crossbar and its analog periphery. This paper evaluates a complete system consisting of a Transport Triggered Architecture (TTA) based host core integrating one or more CIM units, using cycle-accurate simulation. The simulator a) includes the memristor crossbar operations as well as the surrounding analog drivers, b) provides the required interface to the co-processing digital elements, and c) presents a micro-instruction set architecture (micro-ISA) that controls and operates both analog and digital components. It is used to assess the effectiveness of the CIM unit in terms of performance, energy, and area in a full-blown system. For example, the energy-delay-area product (EDAP) of the deep learning application LeNet is reduced by 84% in a full-blown system deploying memristor-based crossbars.
KW - Computation in memory
KW - Memristor
KW - Non-von Neumann architectures
KW - Simulator
KW - Transport Triggered Architecture
UR - http://www.scopus.com/inward/record.url?scp=85093857229&partnerID=8YFLogxK
U2 - 10.1007/978-3-030-60939-9_11
DO - 10.1007/978-3-030-60939-9_11
M3 - Conference contribution
AN - SCOPUS:85093857229
SN - 9783030609382
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 152
EP - 168
BT - Embedded Computer Systems
A2 - Orailoglu, Alex
A2 - Jung, Matthias
A2 - Reichenbach, Marc
PB - Springer Science and Business Media Deutschland GmbH
T2 - 20th International Conference on Embedded Computer Systems: Architectures, Modeling, and Simulation, SAMOS 2020
Y2 - 5 July 2020 through 9 July 2020
ER -