Over the last two decades, Single Instruction Multiple Data (SIMD) processors have become important architectures in embedded systems for image processing applications, mainly because of their area and energy efficiency. Often the processing elements (PEs) of an SIMD processor are only locally connected, so each PE can access only its direct neighbors, which may result in a communication bottleneck. One way to solve this is to use a fully connected communication network between PEs (FC-SIMD). However, this solution leads to excessive communication area cost, low utilization of the communication network, and scalability problems. For example, the area overhead of an FC-SIMD is more than 100% when the number of PEs exceeds 64. In this paper, we introduce a new type of SIMD architecture, called RC-SIMD, with a reconfigurable communication network. It uses a delay-line in the instruction bus, which distributes the accesses to the communication network over time. This architecture requires only a very cheap communication network while performing almost as well as expensive FC-SIMD architectures. However, the new architecture causes irregular resource conflicts. We therefore introduce a conflict model that existing schedulers are able to cope with. Experimental results show that, on average and compared to locally connected SIMDs, RC-SIMD requires 21% fewer cycles than an architecture without the delay-line, while the area overhead is at most 10%.
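The effect of a delay-line on network accesses can be illustrated with a toy model. The sketch below is not the paper's architecture or conflict model; it assumes, purely for illustration, that each PE receives an instruction a fixed number of cycles after its predecessor (`delay_per_pe`) and that a communication instruction occupies the network for one cycle per PE. The names `network_accesses` and `peak_accesses` are hypothetical.

```python
from collections import Counter


def network_accesses(issue_cycles, num_pes, delay_per_pe):
    """Count network accesses per cycle in a toy delay-line model.

    Assumption (not from the paper): PE number `pe` executes an instruction
    issued at cycle t at cycle t + pe * delay_per_pe, and each communication
    instruction accesses the network for exactly one cycle.
    """
    accesses = Counter()
    for t in issue_cycles:
        for pe in range(num_pes):
            accesses[t + pe * delay_per_pe] += 1
    return accesses


def peak_accesses(issue_cycles, num_pes, delay_per_pe):
    """Largest number of PEs that access the network in the same cycle."""
    return max(network_accesses(issue_cycles, num_pes, delay_per_pe).values())


# Without a delay-line, all 8 PEs access the network in the same cycle.
no_delay = peak_accesses([0], num_pes=8, delay_per_pe=0)

# With a one-cycle delay per PE, the accesses are spread over 8 cycles.
with_delay = peak_accesses([0], num_pes=8, delay_per_pe=1)

# Two communication instructions issued 3 cycles apart overlap in time,
# so some cycles see two accesses: a conflict the scheduler must avoid.
overlapping = peak_accesses([0, 3], num_pes=8, delay_per_pe=1)
```

In this toy model, whether two communication instructions collide depends on how far apart they are issued rather than on the instructions alone, which gives some intuition for why such an architecture needs a conflict model that the scheduler can consult.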
|Number of pages||13|
|Journal||Journal of Embedded Computing|
|Publication status||Published - 2006|