The inherent capability of wide-SIMD architectures to exploit data-level parallelism enables high energy efficiency. For scalability and power reasons, wide-SIMD architectures typically have limited connectivity between their processing elements, which makes it challenging to map algorithms that require complex communication patterns. In this work, we propose a novel algorithm that efficiently maps the frequently encountered reduction operation onto a wide-SIMD architecture with limited connectivity.
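To make the mapping problem concrete, the following is a minimal sketch of a conventional tree reduction on a simulated linear array in which each processing element can only read its immediate right neighbour. It is not the algorithm proposed in this work, which the abstract does not describe. The function names, the single-hop shift primitive, and the power-of-two width are all assumptions made for illustration.

```python
def shift_left(values, fill=0):
    """One communication step: every lane reads its right neighbour's value.
    (Hypothetical primitive standing in for neighbour-only connectivity.)"""
    return values[1:] + [fill]


def neighbour_only_reduction(values):
    """Sum all lanes into lane 0 using a tree reduction whose only
    communication primitive is a single-hop neighbour shift.

    Returns (total, number_of_shift_steps).
    """
    n = len(values)
    assert n > 0 and n & (n - 1) == 0, "sketch assumes a power-of-two width"

    lanes = list(values)
    shifts = 0
    stride = 1
    while stride < n:
        # Moving data across `stride` lanes costs `stride` single-hop shifts.
        moved = lanes
        for _ in range(stride):
            moved = shift_left(moved)
            shifts += 1
        # One SIMD add across all lanes; lane 0 accumulates the partial sum.
        lanes = [a + b for a, b in zip(lanes, moved)]
        stride *= 2
    return lanes[0], shifts


if __name__ == "__main__":
    total, shifts = neighbour_only_reduction([1, 2, 3, 4, 5, 6, 7, 8])
    print(total, shifts)
```

Under these assumptions, a width of N lanes needs only log2(N) additions but N - 1 shift steps, so communication rather than arithmetic dominates the cost of a naively mapped reduction.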
|Publication status|Published - 2013|
|Event|ICT.OPEN 2013 - Van der Valk Hotel, Eindhoven, Netherlands|
|Period|27 Nov 2013 → 28 Nov 2013|
|Other|The Interface for Dutch ICT-Research|